RECOMBINANT 170 KD SUBUNIT LECTIN OF
ENTAMOEBA HISTOLYTICA AND METHODS OF USE 

This invention was made, in part, with support
supplied by the U.S. Government under Contracts AI 18841
and AI 26649 awarded by the National Institutes of
Health. The U.S. Government has certain rights in this
invention.


Field of the Invention 

The invention concerns the use of epitope-bearing
regions of the 170 kD subunit of Entamoeba histolytica
Gal/GalNAc adherence lectin, produced recombinantly in
procaryotic systems, in diagnosis and as vaccines. Thus,
the invention relates to the determination of the
presence, absence or amount of antibodies raised by a
subject in response to infection by E. histolytica using
these peptides and to vaccines incorporating them. This
invention also particularly relates to reagents specific
for a novel variant of the 170 kD subunit of
E. histolytica Gal/GalNAc adherence lectin and to the
gene (hgl3) which encodes this novel subunit form, which
represents the third member of the multigene family
encoding this 170 kD subunit.

Background Art 

Entamoeba histolytica infection is extremely
common and affects an estimated 480 million individuals
annually. However, only about 10% of these persons
develop symptoms such as colitis or liver abscess. The
low incidence of symptom occurrence is putatively due to
the existence of both pathogenic and nonpathogenic forms
of the amoeba. As of 1988, it had been established that
the subjects who eventually exhibit symptoms harbor
pathogenic "zymodemes" which have been classified as such
on the basis of their distinctive hexokinase and
phosphoglucomutase isoenzymes. The pathogenic forms are
not conveniently distinguishable from the nonpathogenic
counterparts using morphological criteria, but there is an
almost perfect correlation between infection with a
pathogenic zymodeme and development of symptoms, and
between infection with a nonpathogenic zymodeme and
failure to develop these symptoms.

It is known that E. histolytica infection is
mediated at least in part by the "Gal/GalNAc" adherence
lectin which was isolated from a pathogenic strain and
purified 500 fold by Petri, W.A., et al., J Biol Chem
(1989) 264:3007-3012. The purified "Gal/GalNAc" lectin
was shown to have a nonreduced molecular weight of 260 kD
on SDS-PAGE; after reduction with beta-mercaptoethanol,
the lectin separated into two subunits of 170 and 35 kD
MW. Further studies showed that antibodies directed to
the 170 kD subunit were capable of blocking surface
adhesion to test cells (Petri, et al., J Biol Chem (1989)
supra). Therefore, the 170 kD subunit is believed to be
of primary importance in mediating adhesion.

In addition, the 170 kD subunit is described as
constituting an effective vaccine to prevent E.
histolytica infection in U.S. Patent 5,004,608 issued 2
April 1991.

Studies of serological cross-reactivity among
patients having symptomology characteristic of
E. histolytica pathogenic infection, including liver
abscess and colitis, showed that the adherence lectin was
recognized by all sera tested (Petri, Jr., W.A., et al.,
Am J Med Sci (1989) 296:163-165). The lectin heavy
subunit is almost universally recognized by immune sera
and T-cells from patients with invasive amebiasis (Petri,
et al., Infect Immun (1987) 55:2327-2331; Schain, et al.,
Infect Immun (1992) 60:2143-2146).

DNA encoding both the heavy (170 kD) and light
(35 kD) subunits has been cloned. The heavy and light
subunits are encoded by distinct mRNAs (Mann, B., et al.,
Proc Natl Acad Sci USA (1991) 88:3248-3252) and these
subunits have different amino acid compositions and amino
terminal sequences. The sequence of the cDNA encoding
the 170 kD subunit suggests it to be an integral membrane
protein with a large cysteine-rich extracellular domain
and a short cytoplasmic tail (Mann, B., et al., Proc Natl
Acad Sci USA (1991) supra; Tannich, et al., Proc Natl
Acad Sci USA (1991) 88:1849-1853). The derived amino
acid sequence of the 170 kD lectin shows that the
extracellular domain can be divided into three regions on
the basis of amino acid composition. The amino terminal
amino acids 1-187 are relatively rich in cysteine (3.2%)
and tryptophan (2.1%). The amino acid sequence at positions
188-378 does not contain cysteine, and the amino acid
sequence at positions 379-1209 contains 10.8% cysteine
residues. The isolation of clones encoding the heavy
chain subunit is further described in U.S. Patent
5,260,429 issued 9 November 1993, the disclosure of which
is incorporated herein by reference. In that patent,
diagnostic methods for the presence of E. histolytica
based on the polymerase chain reaction and the use of DNA
probes are described.
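The division of the extracellular domain by amino acid composition can be illustrated with a short calculation. The following is only a sketch: the region boundaries are those stated above, while the function names and the region labels are illustrative assumptions and form no part of the disclosure. The sequence itself must be supplied by the user.

```python
from typing import Dict, Tuple

# Region boundaries (1-based, inclusive) as stated in the text.
# The labels are illustrative names, not terms defined by the source.
REGIONS: Dict[str, Tuple[int, int]] = {
    "cysteine/tryptophan-rich": (1, 187),
    "cysteine-free": (188, 378),
    "cysteine-rich": (379, 1209),
}


def residue_percent(seq: str, start: int, end: int, residue: str = "C") -> float:
    """Return the percentage of `residue` in positions start..end (1-based, inclusive)."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("region lies outside the sequence")
    segment = seq[start - 1:end]
    return 100.0 * segment.count(residue) / len(segment)


def region_composition(seq: str, residue: str = "C") -> Dict[str, float]:
    """Compute the percentage of `residue` in each named region."""
    return {
        name: residue_percent(seq, start, end, residue)
        for name, (start, end) in REGIONS.items()
    }
```

Applied to the mature 170 kD sequence with residue "C", and then with residue "W" for the first region, such a calculation would be expected to give figures of the kind quoted above.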

The heavy subunit is considered to be encoded
by a multigene family (Mann, B., et al., Parasit Today
(1991) 7:173-176). Two different heavy subunit genes,
hgl1 and hgl2, have been sequenced by separate
laboratories. While hgl2 was isolated from an HM-1:IMSS
cDNA library in its entirety (Tannich, E. et al. Proc
Natl Acad Sci USA (1991) 88:1849-1853), hgl1 was isolated
in part from an H-302:NIH cDNA library and in part by PCR
amplification of the gene from the HM-1:IMSS genome
(Mann, B.J. et al. Proc Natl Acad Sci USA (1991)
88:3248-3252). As the amino acid sequences encoded by
these two genes are 87.6% identical (Mann, B.J. et al.
Parasit Today (1991) 7:173-176), the differences could be
explained by strain variation alone. The presence of
multiple bands hybridizing to an hgl probe on Southern
blots, however, is consistent with the existence of a
170 kDa subunit gene family (Tannich, E. et al. Proc Natl
Acad Sci USA (1991) 88:1849-1853).
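A percent identity figure of the kind cited above is computed from a pairwise alignment. The sketch below shows one common convention; the cited reference may have used a different one. It is assumed here that the two sequences have already been aligned to equal length with "-" as the gap character, and that a residue opposite a gap counts as a mismatch. The function name is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: columns that are gaps in both sequences are ignored, and a
    residue opposite a gap is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Other conventions, for example excluding all gapped columns from the denominator, give somewhat different values for the same alignment, so a reported identity is best read together with the method used.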

Monoclonal antibodies specifically
immunoreactive with various epitope-bearing regions of
the 170 kD heavy chain subunit have also been disclosed
in U.S. Patent 5,272,058 issued 21 December 1993, the
disclosure of which is incorporated herein by reference
in its entirety. That patent also describes use of
these antibodies to detect the 170 kD heavy chain and the
use of the 170 kD subunit to detect antibodies in serum
or other biological samples. The experimental work
described utilizes the native protein. Further
characterization of these antibodies is described in a
publication by Mann, B.J., et al., Infect Immun (1993)
61:1772-1778, also incorporated herein by reference.

Various immunoassay techniques have been used
to diagnose E. histolytica infection. ELISA techniques
have been used to detect the presence or absence of
E. histolytica antigens both in stool specimens and in
sera, though these tests do not seem to distinguish
between the pathogenic and nonpathogenic strains. In a
seminal article, Root, et al., Arch Invest Med (Mex)
(1978) 2: Supplement 1:203, described the use of ELISA
techniques for the detection of amoebic antigen in stool
specimens using rabbit polyclonal antiserum, and various
forms of this procedure have been used, some in
conjunction with microscopic studies (Palacios, et al.,
Arch Invest Med (Mex) (1978) Supplement 1:203; Randall
et al., Trans Roy Soc Trop Med Hyg (1984) 78:593; Grundy,
Trans Roy Soc Trop Med Hyg (1982) 76:396; Ungar, Am J
Trop Med Hyg (1985) 34:465). These studies on stool
specimens and on other biological fluids are summarized
in Amebiasis: Human Infection by Entamoeba Histolytica,
J. Ravdin, ed. (1988) Wiley Medical Publishing,
pp. 646-648.

Conversely, amebic serology is also a critical
component in the diagnosis of invasive amebiasis. One
approach utilizes conventional serologic tests, such as
the indirect hemagglutinin test. These tests are very
sensitive but seropositivity is persistent for years
(Krupp, I.M., Am J Trop Med Hyg (1970) 19:57-62; Lobel,
H.O. et al., Ann Rev Microbiol (1978) 12:379-347). Thus,
healthy subjects may give positive responses to the
assay, creating an undesirably high background. Similar
problems with false positives are found in using
immunoassay tests involving a monoclonal antibody and
purified native 170 kD protein (Ravdin, J.I., et al., J
Infect Dis (1990) 162:768-772).

Recombinant E. histolytica proteins other than
the 170 kD subunit have been used as the basis for
serological tests. Western blotting using a recombinant
form of the "52 kD serine-rich protein" was highly
specific for invasive disease and had a higher predictive
value (92 vs. 65%) than an agar gel diffusion test for
diagnosis of acute amebiasis (Stanley, Jr., S.L., et al.,
Proc Natl Acad Sci U.S.A. (1990) 87:4976-4980; Stanley,
Jr., S.L., et al., JAMA (1991) 266:1984-1986). However,
the overall sensitivity was lower than for the
conventional agar gel test (82% vs. 90-100%).
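The sensitivity and predictive value figures quoted above are ratios derived from the counts of true and false results against a reference diagnosis. A minimal sketch of the standard definitions follows. The class and function names are illustrative, and any counts passed to them are hypothetical inputs rather than data from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Outcome counts for a serological test against a reference diagnosis."""
    tp: int  # diseased subjects testing positive
    fp: int  # non-diseased subjects testing positive
    tn: int  # non-diseased subjects testing negative
    fn: int  # diseased subjects testing negative


def _ratio(numerator: int, denominator: int) -> float:
    if denominator == 0:
        raise ValueError("ratio undefined for an empty group")
    return numerator / denominator


def sensitivity(c: Counts) -> float:
    """Fraction of diseased subjects that the test detects."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Fraction of non-diseased subjects that the test reports as negative."""
    return _ratio(c.tn, c.tn + c.fp)


def positive_predictive_value(c: Counts) -> float:
    """Fraction of positive results that correspond to true disease."""
    return _ratio(c.tp, c.tp + c.fp)
```

Persistent seropositivity in healthy subjects raises the false positive count, which lowers both specificity and positive predictive value without affecting sensitivity. This is why a test can be very sensitive and still give an undesirably high background.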

Thus, there remains a need for serological
tests which will provide optimum sensitivity while
minimizing the number of false positives. The
present invention provides such a test by utilizing, as
antigen, epitope-bearing portions of the 170 kD subunit
of the adherence lectin produced recombinantly in
procaryotic systems.

It is particularly advantageous to use
recombinantly produced, nonglycosylated peptides or
proteins in this assay since these peptides are easily
and efficiently obtained and are easily standardized.
Furthermore, since selected portions of the lectin heavy
chain subunit can be produced, epitopes characteristic of
the pathogenic or nonpathogenic forms of E. histolytica
can be produced and used to distinguish these forms in
the assays. Subsequent to the invention herein, a report
of immunoreactivity of recombinant 170 kD lectin with
immune sera was published by Zhang, Y., et al., J. Clin
Micro-Immunol (1992) 2788-2792. Applicants incorporate
by reference their own publication: Mann, B.J., et al.,
Infect and Immun (1993) 61:1772-1778.

Similarly, although it is known that the 170 kD
subunit may be used as a vaccine as described in the
above-referenced U.S. Patent 5,004,608, recombinantly
produced forms of the 170 kD subunit, specifically those
obtained from procaryotic cells and therefore lacking
glycosylation, may offer advantages in reproducibility of
product and in ease of preparation of subunit vaccines.
The present invention is directed to this desirable
result.


Disclosure of the Invention 

The invention provides diagnostic tests which
permit the assessment of patients for invasive E.
histolytica infection and vaccines for prevention of
infection. The invention also provides a novel third
variant of the 170 kD subunit of the Gal/GalNAc adherence
lectin and a gene (hgl3) which encodes this novel
protein. Accordingly, the diagnostic tests of the
invention are based on the genetic sequences of all three
variants of the 170 kD subunit of the Gal/GalNAc
adherence lectin which are encoded by three different
genes in a multigene family.

Pathogenic and nonpathogenic strains can be
distinguished by use of the invention diagnostic method,
if desired. The tests use, as antigen, an epitope-bearing
portion of the 170 kD subunit of the Gal/GalNAc
adherence lectin recombinantly produced in procaryotic
systems. Despite the absence of glycosylation from such
portions and despite the lack of post-translational
modifications characteristic of the native protein or
peptide, the recombinantly produced proteins are
effective antigens in these assays.

Thus, in one aspect, the invention is directed
to a method to detect the presence or absence of
antibodies immunoreactive with pathogenic and/or
nonpathogenic E. histolytica in a biological sample, which
method comprises contacting the sample with an
epitope-bearing portion of the 170 kD heavy chain of the
Gal/GalNAc adherence lectin wherein the lectin is
nonglycosylated and in a form obtainable from procaryotic
cells. If distinction between antibodies to the
pathogenic and nonpathogenic forms is desired, the
portion may be chosen so as to be characteristic of the
pathogenic or nonpathogenic form. Alternatively, the
assay may be conducted as a competition assay using MAbs
with such characteristics. The contacting is conducted
under conditions where the epitope-bearing portion forms
complexes with any antibodies present in the biological
sample which are immunoreactive with an epitope on the
portion. The presence, absence or amount of such
complexes is then assessed, either directly or in a
competition format, as a measure of the antibody
contained in the biological sample. The invention is
also directed to materials and kits suitable for
performing the methods of the invention.
In a second aspect, the invention is directed
to methods to prevent E. histolytica infection using
vaccines containing, as active ingredient, epitope-bearing
portions of the 170 kD subunit produced
recombinantly in procaryotic systems, as described above.
The invention is also directed to vaccines containing
this active ingredient.

In other aspects, the invention is directed to
epitope-bearing portions of the 170 kD subunit produced
recombinantly in procaryotic systems and thus in a form
characteristic of such production. One characteristic is
lack of glycosylation; in addition, secondary structure
of proteins produced by procaryotic hosts differs from
that of proteins produced by the natural source.

In yet another aspect, the invention is
directed to a DNA in purified and isolated form which
consists essentially of a DNA encoding the 170 kD heavy
chain subunit of pathogenic E. histolytica Gal/GalNAc
adherence lectin, which subunit is encoded by the hgl3
gene for which the nucleotide sequence and deduced amino
acid sequence are shown in Figure 4. In further aspects,
the invention is directed to both nucleic acid and
immunological reagents which are enabled by the discovery
of the hgl3 gene, reagents which are specific for each of
the hgl1, hgl2 or hgl3 genes, as well as reagents which
detect common regions of all three hgl genes or their
nucleic acid or protein products. For example,
oligonucleotide probes specific for any one of these
three genes or for a sequence common to all three genes
may be identified by one of ordinary skill in the art,
using conventional nucleic acid probe design principles,
by comparisons of the three DNA sequences for these
genes. See Example 6.
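The comparison of three DNA sequences to find gene-specific and common probe candidates can be sketched as a set operation on subsequences of fixed length. The sketch below is a simplification and is not the procedure of Example 6: it considers exact matches on one strand only, and it ignores melting temperature, secondary structure and reverse complements, all of which conventional probe design would take into account. The function names are illustrative, and the sequences are supplied by the user.

```python
from typing import Dict, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k subsequences of seq (one strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(
    genes: Dict[str, str], k: int
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split k-mers into those common to all genes and those unique to one gene."""
    if not genes:
        raise ValueError("at least one gene sequence is required")
    per_gene = {name: kmers(seq, k) for name, seq in genes.items()}
    common = set.intersection(*per_gene.values())
    specific: Dict[str, Set[str]] = {}
    for name, own in per_gene.items():
        others: Set[str] = set()
        for other_name, other_kmers in per_gene.items():
            if other_name != name:
                others |= other_kmers
        specific[name] = own - others  # k-mers found in this gene only
    return common, specific
```

Candidates in the common set correspond to probes intended to detect all three hgl genes, while each gene-specific set corresponds to probes intended to distinguish one gene from the other two.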

In still further aspects, the invention is directed
to a method to detect the presence, absence, or amount of
a pathogenic or nonpathogenic form of Entamoeba
histolytica, where E. histolytica has both pathogenic and
nonpathogenic forms, in a biological sample, which method
comprises contacting the sample with a monoclonal
antibody immunospecific for an epitope of the 170 kD
subunit of Gal/GalNAc lectin unique to the pathogenic or
to the nonpathogenic form, or shared by the pathogenic
and nonpathogenic forms of E. histolytica, to form an
immunocomplex when the pathogenic and/or nonpathogenic
form is present, and detecting the presence, absence or
amount of the immunocomplex. In this method, the epitope
is selected to be specific for one of the 170 kD subunits
encoded by the hgl1, hgl2 or hgl3 genes, or for a common
region of the subunits from all three hgl genes.

In another aspect, the invention is directed to a method
to determine the presence, absence or amount of
antibodies specifically immunoreactive with the
Gal/GalNAc lectin derived from E. histolytica, which
method comprises contacting a biological sample with the
Gal/GalNAc lectin or the 170 kD subunit thereof in
purified and isolated form, under conditions wherein
antibodies immunospecific for said lectin or subunit will
form a complex, and detecting the presence, absence or
amount of the complex, wherein the purified and isolated
Gal/GalNAc lectin or subunit is derived from either a
pathogenic or nonpathogenic form of E. histolytica, and
is a 170 kD subunit encoded by one of the hgl1, hgl2 or
hgl3 genes. Detailed descriptions of these and related
methods for detecting pathogenic or nonpathogenic forms
of E. histolytica and antibodies specifically
immunoreactive with the Gal/GalNAc lectin derived from E.
histolytica, as well as reagent kits suitable for the
conduct of such methods, are disclosed in U.S. Patent
5,272,058, the entire disclosure of which is incorporated
herein by reference.



Brief Description of the Drawings

Figure 1A shows the DNA and amino acid sequence
deduced from the nucleotide sequence corresponding to the
170 kD heavy chain of the adherence lectin from
pathogenic strain HM1:IMSS, designated hgl1.

Figure 1B shows the deduced amino acid sequence
of hgl1 with the amino-terminal amino acid of the mature
protein designated as amino acid number 1.

Figure 2A is a diagram of the construction of
expression vectors for recombinant production of
specified portions of the 170 kD subunit; Figure 2B shows
the pattern of deletion mutants.

Figure 3 is a diagram of the location of human
B cell epitopes and pathogenic-specific epitopes on the
170 kD heavy chain.

Figure 4A shows the DNA and amino acid sequence
deduced from the nucleotide sequence corresponding to the
170 kD heavy chain of the adherence lectin from
pathogenic strain HM1:IMSS, designated hgl3.

Figure 4B shows the deduced amino acid sequence
of hgl3 with the amino-terminal amino acid of the mature
protein designated as amino acid number 1. The putative
signal sequence and transmembrane domains are overlined
and underlined respectively. Conserved cysteine residues
(•) and potential sites of glycosylation (*) are
indicated.

Figure 5 shows in schematic form a comparison
of amino acid sequences of three heavy subunit genes.
The top diagram is a schematic representation of
a heavy subunit gene. Starting at the amino terminus,
regions include the cysteine/tryptophan (C-W) rich
domain, the cysteine-free (C-free) domain, the
cysteine-rich (C-rich) domain, and the putative
transmembrane (TM) sequence and cytosolic domains (Mann,
B.J. et al. Parasit Today (1991) 7:173-176). Amino acid
sequence comparisons of hgl1, hgl2 and hgl3 are shown.
Upright lines indicate nonconservative amino acid
substitutions in the amino acid sequence of the second
gene as compared to the first gene listed to the right.
Downward arrowheads indicate a deletion while upright
arrowheads indicate an insertion. The number of residues
inserted or deleted is listed below the arrowheads and
the total percent amino acid sequence identity is listed
at right.

Modes of Carrying Out the Invention 

The invention provides methods and materials
which are useful in assays to detect antibodies directed
to pathogenic and/or nonpathogenic forms of E.
histolytica and in vaccines. The diagnostic assays can
be conducted on biological samples derived from subjects
at risk for infection or suspected of being infected.
The assays can be designed to distinguish pathogenic from
nonpathogenic forms of the amoeba if desired. The
vaccines are administered to subjects at risk for amebic
infections.

The assays of the invention rely on the ability
of an epitope-bearing portion of the 170 kD subunit
produced recombinantly in procaryotic cultures to
immunoreact with antibodies contained in biological
samples obtained from individuals who have been infected
with E. histolytica. Even though the relevant peptide or
protein is produced in a procaryotic system, and is thus
not glycosylated or processed after translation in a
manner corresponding to the native protein, the
epitope-bearing portions thus prepared are useful antigens
in immunoassays performed on samples prepared from
biological fluids, cells, tissues or organs, or their
diluted or fractionated forms. Similarly, these peptides
are also immunogenic.

The use of recombinant forms of the antigen
offers advantages of cost-effective, reliable production
of pure antigen, thus assuring the uniformity of the
assay materials. Recombinant production in bacteria is a
particularly efficient and useful method. It is
surprising that such procaryotic systems can produce
successful antigens and immunogens, since the peptides
produced are not processed in a manner analogous to the
reactive native forms.

Furthermore, recombinant production facilitates
the preparation of specific epitopes, thus providing a
means for detecting antibodies specifically
immunoreactive with pathogenic or nonpathogenic forms of
the amoeba, as well as offering the opportunity to
provide subunit vaccines.

Thus, the invention is directed to methods to
detect antibodies in biological samples and to immunize
subjects at risk using these recombinantly produced
epitope-bearing portions as antigens or immunogens as
well as to the recombinantly produced peptides
themselves and to materials useful in performing the
assays and in administering the vaccines.


Definitions 

The diagnostic assays may be designed to
distinguish antibodies raised against nonpathogenic or
pathogenic forms of the amoeba. "Pathogenic forms" of E.
histolytica refers to those forms which are invasive and
which result in symptomology in infected subjects.
"Nonpathogenic forms" refers to those forms which may be
harbored asymptomatically by carriers.

The assays and vaccines of the invention
utilize an epitope-bearing portion of the 170 kD subunit
of the Gal/GalNAc lectin. "Gal/GalNAc lectin" refers to a
glycoprotein found on the surface of E. histolytica which
mediates the adherence of the amoeba to target cells, and
which mediation is inhibited by galactose or
N-acetylgalactosamine. The Gal/GalNAc lectin refers
specifically to the lectin reported and isolated by
Petri, et al. (supra) from the pathogenic strain
HM1:IMSS, and to the corresponding lectin found in other
strains of E. histolytica. The "170 kD subunit" refers
to the large subunit, upon reduction of the Gal/GalNAc
lectin, such as that obtained by Petri, et al. and shown
in Figure 1 as well as to its corresponding counterparts
in other strains.

Diagnostic Assays 

With respect to the diagnostic assays of the
invention, the complete 170 kD antigen or an
epitope-bearing portion thereof can be used in the assays.
Such epitope-bearing portions can be selected as
characteristic of pathogens or nonpathogens or common to
both.

As shown hereinbelow, the portion of the 170 kD
protein which contains epitopes for all monoclonal
antibodies prepared against the lectin is found at amino
acid positions 596-1138. There appears to be an epitope
characteristic of pathogens within each of amino acid
positions 596-818, 1082-1138, and 1033-1082. Positions
895-998 contain epitopes which are shared by pathogens
and nonpathogens as well as epitopes characteristic of
pathogenic strains. Thus, to utilize fragments of the
recombinantly produced protein for detection of
antibodies, a peptide representing positions 596-818,
1033-1082 or 1082-1138 may be used to detect antibodies
raised against pathogens by hosts in general; however,
the epitope at positions 596-818 is not recognized by
human antisera. Mixtures of these peptides could also be
used. Alternatively, longer forms of the antigen can be
used by selecting the appropriate positions depending on
whether pathogenic and nonpathogenic amoebae are to be
distinguished.
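The choice of fragment described above amounts to filtering a small table of regions by two properties. The sketch below records the regions named in this paragraph and selects among them. The field and function names are illustrative. Marking regions 895-998, 1033-1082 and 1082-1138 as recognized by human antisera is an assumption drawn from the statement that only the 596-818 epitope is not so recognized.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class EpitopeRegion:
    start: int               # first amino acid position (inclusive)
    end: int                 # last amino acid position (inclusive)
    pathogen_specific: bool  # carries an epitope characteristic of pathogens
    shared: bool             # carries an epitope shared with nonpathogens
    human_sera: bool         # recognized by human antisera


REGIONS = [
    EpitopeRegion(596, 818, True, False, False),   # stated: not recognized by human antisera
    EpitopeRegion(895, 998, True, True, True),     # human_sera=True is an assumption
    EpitopeRegion(1033, 1082, True, False, True),  # human_sera=True is an assumption
    EpitopeRegion(1082, 1138, True, False, True),  # human_sera=True is an assumption
]


def select_regions(
    regions: Sequence[EpitopeRegion], pathogen_only: bool, human: bool
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of regions suited to the requested assay."""
    chosen = []
    for r in regions:
        if human and not r.human_sera:
            continue  # not useful for testing human samples
        if pathogen_only and (r.shared or not r.pathogen_specific):
            continue  # would also react with antibodies to nonpathogens
        chosen.append((r.start, r.end))
    return chosen
```

For an assay on human sera intended to detect only antibodies raised against pathogens, this selection returns positions 1033-1082 and 1082-1138.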

As shown in Example 4, below, epitope-bearing
portions relevant for human testing include portions
2-482, 1082-1138, 1032-1082 and 894-998. Only the portions
represented by 1082-1138 and 1032-1082 appear specific
for antibodies against pathogenic amoebae. These
epitope-bearing portions may be used as single peptides, as
uniquely lectin-derived portions of chimeric proteins, as
mixtures of peptides or of such proteins, or as portions
of a single, multiple-epitope-bearing protein.
Procedures for preparing recombinant peptides or proteins
containing only a single epitope-bearing portion
identified above, or multiples of such portions
(including tandem repeats) are well understood in the
art.

The assays are designed to detect antibodies in
biological samples which are "immunospecific" or
"immunoreactive" with respect to the epitope-bearing
portion, i.e., with respect to at least one epitope
contained in this portion. As used herein,
"immunospecific" or "immunoreactive" with respect to a
specified target means that the antibody thus described
binds that target with significantly higher affinity than
that with which it binds to alternate haptens. The
degree of specificity required may vary with
circumstances, but typically an antibody immunospecific
for a designated target will bind to that target with an
affinity which is at least one or two, or preferably
several orders of magnitude greater than that with which
it binds alternate haptens.
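The "orders of magnitude" criterion can be expressed as the base-10 logarithm of the ratio of two affinities. The sketch below assumes that both affinities are given as association constants in the same units. The function names and the default threshold of one order of magnitude are illustrative choices reflecting the lower bound stated above.

```python
import math


def orders_of_magnitude(ka_target: float, ka_alternate: float) -> float:
    """Base-10 logarithm of the ratio of target affinity to alternate affinity."""
    if ka_target <= 0 or ka_alternate <= 0:
        raise ValueError("association constants must be positive")
    return math.log10(ka_target / ka_alternate)


def is_immunospecific(
    ka_target: float, ka_alternate: float, min_orders: float = 1.0
) -> bool:
    """True when the target is bound at least `min_orders` orders more tightly."""
    return orders_of_magnitude(ka_target, ka_alternate) >= min_orders
```

Raising `min_orders` to 2 or more corresponds to the stricter, preferred degrees of specificity.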

The assays can be performed in a wide variety
of protocols depending on the nature of the sample, the
circumstances of performing the assays, and the
particular design chosen by the clinician. The
biological sample is prepared in a manner standard for
the conduct of immunoassays; such preparation may involve
dilution if the sample is a biological fluid,
fractionation if the sample is derived from a tissue or
organ; or other standard preparation procedures which are
known in the art. Thus, "biological sample" refers to
the sample actually used in the assay which is derived
from a fluid, cell, tissue or organ of a subject and
prepared for use in the assay using the standard
techniques. Normally, plasma or serum is the source of
biological sample in these assays.

The assays may be conducted in a competition
format employing a specific binding partner for the
epitope-bearing portion. As used herein, "specific
binding partner" refers to a substance which is capable
of specific binding to a targeted substance, such as the
epitope-bearing portion of the 170 kD subunit. In
general, such a specific binding partner will be an
antibody, but any alternative substance capable of such
specific binding, such as a receptor, enzyme or
arbitrarily designed chemical compound might also be
used. In such contexts, "antibody" refers not only to
immunoglobulin per se, but also to fragments of
immunoglobulin which retain the immunospecificity of the
complete molecule. Examples of such fragments are well
known in the art, and include, for example, Fab, Fab',
and F(ab')2 fragments. The term "antibody" also includes
not only native forms of immunoglobulin, but forms of the
immunoglobulin which have been modified, as techniques
become available in the art, to confer desired properties
without altering the immunospecificity. For example, the
formation of chimeric antibodies derived from two species
is becoming more practical. In short, "antibodies"
refers to any component or derived form of an
immunoglobulin which retains the immunospecificity of the
immunoglobulin per se.

A particularly useful form of specific binding
reagent for the assay methods of the invention is
the monoclonal antibody. Three categories of monoclonal
antibodies have been prepared to the 170 kD subunit. One
category of antibody is immunospecific for epitopes
"unique" to pathogenic forms. These antibodies are
capable, therefore, of immunoreaction to a significant
extent only with the pathogenic forms of the amoeba or
with the 170 kD subunit of lectin isolated from pathogenic
forms. A second set of monoclonal antibodies is
immunoreactive with epitopes which are "unique" to
nonpathogenic forms. Thus, these antibodies are
immunoreactive to a substantial degree only with the
nonpathogenic amoeba or their lectins and not with the
pathogenic forms. A third category of monoclonal
antibodies is immunoreactive with epitopes common to
pathogenic and nonpathogenic forms and these antibodies
are capable of immunoreaction with the subunit or with
the amoeba regardless of pathogenicity.

With respect to the monoclonal antibodies
described herein, those immunoreactive with epitopes 1
and 2 of the 170 kD subunit isolated from the pathogenic
strain exemplified are capable of reacting, also, with
the corresponding epitopes on nonpathogens. On the other
hand, those immunoreactive with epitopes 3-6 are capable
of immunoreaction only with the 170 kD subunit of
pathogenic strains. By applying the techniques for
isolation of the pathogenic 170 kD subunit to amoeba
which are nonpathogenic, a 170 kD subunit can be obtained
for immunization protocols which permit the analogous
preparation of MAbs immunoreactive with counterpart
epitopes 3-6 in the nonpathogenic forms.

Of course, with respect to antibodies found in
the biological sample, in general, these will be found in
the form of immunoglobulins. However, pretreatment of
the sample with an enzyme, for example, to remove the
portions of the antibodies contained therein, does not
debilitate the sample with respect to its ability to
respond to the assay.



Assay Procedure

For the conduct of the assays of the invention,
in general, the biological sample is contacted with the
epitope-bearing portion used as an antigen in the
immunoassay. The presence, absence or amount of the
resulting complex formed between any antibody present in
the sample and the epitope-bearing portion is measured
directly or competitively.

As is well understood in the art, once the
biological sample is prepared, there is a multiplicity of
alternative protocols for conduct of the actual assay.
In one rather straightforward protocol, the
epitope-bearing portion provided as antigen may be coupled
to a solid support, either by adsorption or by covalent
linkage, and treated with the biological sample. The
ability of any antibodies in the sample to bind to
coupled antigen is then determined.

This ability may be determined in a "direct"
form of the assay in which the level of complex formation
by the antibody is measured directly. In one
particularly convenient format of this approach, the
antigen may be supplied as a band on a polyvinylidene
difluoride (PVDF) membrane and contacted with the
biological sample; any resulting complexes formed with
antibody on the PVDF membrane are then detected. This
protocol is substantially a Western blot procedure.
Alternatively, microtiter plates or other suitable solid
supports may be used. The binding of antibody to the
antigen coupled to support can then be detected as
described above for the Western blot procedure using
conventional techniques generally involving secondary
labeling using, for example, antibodies to the species
from which the biological sample is derived. Such labels
may include radioisotopes, fluorescent tags, enzyme
labels and the like, as is conventionally understood.
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The assay may also be formatted as a 
competition assay wherein the antigen coupled to solid 
support is treated not only with the biological sample 
but also with competing specific binding partner 
5 immunospecif ic for at least one epitope contained in the 
antigen. The competing binding partner is preferably an 
antibody. The competing antibody may be polyclonal or 
monoclonal and may itself be labeled or may be capable of 
being labeled in a secondary reaction. In a typical 

10 conduct of such a competitive test, a competitive 

specific binding partner for the antigen is generally 
supplied in labeled form and the success of the 
competition from the biological sample is measured as a 
reduction in the amount of label bound in the resulting 

15 complex or increased levels of label remaining in the 
supernatant. If monoclonal antibodies are used, the 
assay can readily be made specific for pathogenic or 
nonpathogenic reacting antibodies, if desired, by 
choosing antibodies of the appropriate specificity. 

20 Thus, if the assay is to be made specific for antibodies 
raised against pathogenic forms of E. histolytica, the 
competition will be provided by a monoclonal antibody 
specific for an epitope characteristic of pathogenic 
strains. 

Another manner in which the assay may be made
specific for pathogenic or nonpathogenic forms is in the
choice of the epitope-bearing portion. If antibodies
specific to the pathogens are to be detected, an epitope-
bearing portion is chosen which bears only epitopes
characteristic of pathogenic strains. Conversely,
assays for antibodies immunospecific for nonpathogens
can be conducted by utilizing as antigen only portions of
the subunit which contain epitopes characteristic of
nonpathogens. Where characterization as pathogen- or
nonpathogen-specific antibodies is unnecessary, antigen
containing both such epitopes or epitopes shared by both
forms may be used.

Additional ways to distinguish between
antibodies immunospecific for pathogens and for
nonpathogens employ competition assays with monoclonal
antibodies of such specificities, as described above.

Alternatively, the biological sample can be
coupled to a solid support, and the support then treated
with the desired epitope-bearing portion under conditions
where a complex can be formed with the epitope-bearing
portion. Subsequent treatment of the
support with antibodies known to immunoreact with the
antigen can then be used to detect whether antigen has
been bound.

Thus, the biological sample to be tested is
contacted with the epitope-bearing portion, which is
derived from a pathogenic or nonpathogenic form, or
both, of E. histolytica, so that a complex is formed.
The complex is then detected by suitable labeling, either
by supplying the antigen in labeled form, or by a
secondary labeling process which forms a ternary complex.
The reaction is preferably conducted using a solid phase
to detect the formation of the complex attached to solid
support, or the complex can be precipitated using
conventional precipitating agents such as polyethylene
glycol.

In a more complex form of the assay,
competitive assays can be used wherein the biological
sample, preferably serum or plasma, provides the cold
antibody to compete with a specific binding partner, such
as a labeled monoclonal antibody preparation known to
bind specifically to an epitope unique to the Gal/GalNAc
lectin or its 170 kD subunit of a pathogenic or
nonpathogenic form. In this embodiment, the binding to
labeled specific monoclonal antibody is conducted in the
presence and absence of biological sample, and the
diminution of labeling of the resulting complex in the
presence of sample is used as an index to determine the
level of competing antibody.
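
The diminution of labeling described above can be expressed as a percent inhibition. The following is a minimal sketch only; the function name, the background-subtraction step, and the numeric readings are illustrative assumptions and are not taken from the specification.

```python
def percent_inhibition(signal_without_sample, signal_with_sample, background=0.0):
    """Percent reduction in bound label caused by the biological sample.

    signal_without_sample: label bound by the labeled MAb alone.
    signal_with_sample:    label bound when the sample competes.
    background:            assumed nonspecific signal, subtracted from both.
    """
    reference = signal_without_sample - background
    if reference <= 0:
        raise ValueError("reference signal must exceed background")
    competed = signal_with_sample - background
    return 100.0 * (reference - competed) / reference


# Hypothetical readings (arbitrary units), for illustration only.
inhibition = percent_inhibition(12000.0, 4500.0, background=500.0)
print(f"{inhibition:.1f}% inhibition")
```

A higher percent inhibition would indicate a higher level of competing antibody in the sample; any cutoff for calling a sample positive would have to be established empirically.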

Kits suitable for the conduct of these methods
include the appropriate labeled antigen or antibody
reagents and instructions for conducting the test. The
kit may include the antigen coupled to solid support as
well as additional reagents.

Methods of Protection and Vaccines

The recombinant 170 kD subunit or an epitope-
bearing portion thereof may be used as an active
ingredient. Preferred regions include positions 482-1138,
596-1138, 895-998, 1033-1082 and 1082-1138.

The 170 kD subunit or its epitope-bearing
regions may also be produced recombinantly in procaryotic
cells for the formulation of vaccines. The recombinantly
produced 170 kD protein or an epitope-bearing region
thereof can be used as an active ingredient in vaccines
for prevention of E. histolytica infection in subjects
who are at risk for such condition. Sufficiently large
portions of the 170 kD protein can be used per se; if
only small regions of the molecule, for example those
containing 20 amino acids or fewer, are to be used, it
may be advantageous to couple the peptide to a neutral
carrier to enhance its immunogenicity. Such coupling
techniques are well known in the art, and include
standard chemical coupling techniques optionally effected
through linker moieties such as those available from
Pierce Chemical Company, Rockford, Illinois. Suitable
carriers may include, for example, keyhole limpet
hemocyanin (KLH), E. coli pilin protein K99, BSA, or the
VP6 protein of rotavirus. Another approach employs
production of fusion proteins which include the
epitope-bearing regions fused to additional amino acid
sequence. In addition, because of the ease with which
recombinant materials can be manipulated, the
epitope-bearing region may be included in multiple copies
in a single molecule, or several epitope-bearing regions
can be "mixed and matched" in a single molecule.
The active ingredient, or mixture of active
ingredients, in the vaccine is formulated using standard
formulation for administration of proteins or peptides
and the compositions may include an immunostimulant or
adjuvant such as complete Freund's adjuvant, aluminum
hydroxide, liposomes, ISCOMs, and the like. General
methods to prepare vaccines are described in Remington's
Pharmaceutical Sciences, Mack Publishing Company, Easton,
PA (latest edition). The compositions contain an
effective amount of the active ingredient peptide or
peptides together with a suitable amount of carrier
vehicle, including, if desired, preservatives, buffers,
and the like. Other descriptions of vaccine formulations
are found in "New Trends and Developments in Vaccines",
Voller, A., et al., University Park Press, Baltimore,
Maryland (1978).

The vaccines are administered as is generally
understood in the art. Ordinarily, administration is
systemic through injection; however, other effective
means of administration are included. With suitable
formulation, for example, peptide vaccines may be
administered across the mucous membrane using penetrants
such as bile salts or fusidic acids in combination,
usually, with a surfactant. Transcutaneous means for
administering peptides are also known. Oral formulations
can also be used. Dosage levels depend on the mode of
administration, the nature of the subject, and the nature
of the carrier/adjuvant formulation. Typical amounts of
protein are in the range of 0.01 µg to 1 mg/kg. However,
this is an arbitrary range which is highly dependent on
the factors cited above. In general, multiple
administrations in standard immunization protocols are
preferred; such protocols are standard in the art.

A preferred epitope-bearing region of the 170
kD subunit is that represented by amino acids 482-1138,
which includes the cysteine-rich domain. This region is
encoded by nucleotides 1492-3460 shown in Figure 1
herein. Preferred regions include those bearing epitopes
which are specific for antibodies against pathogenic
amoeba, i.e., regions 1082-1138 and 1033-1082.
However, the epitope-bearing region at positions 895-998
may also be used. For regions of this length, production
of peptides with multiple copies of the epitope-bearing
regions is particularly advantageous.

Production of Recombinant Epitope-bearing Portions

The epitope-bearing portions of the 170 kD
subunit can be conveniently prepared in a variety of
procaryotic systems using control sequences and hosts
ordinarily available in the art. The portions may be
provided as fusion proteins or as mature proteins and may
be produced intracellularly or secreted. Techniques for
constructing expression systems to effect all of these
outcomes are well understood in the art. If the epitope-
bearing portion is secreted, the medium can be used
directly in the assay to provide the antigen, or the
antigen can be recovered from the medium and further
purified if desired. If the protein is produced
intracellularly, lysates of cultured cells may be used
directly or the protein may be recovered and further
purified. In the Examples below, the epitope-bearing
portion is provided as a fusion protein using the
commercially available expression vector pGEX.
Alternative constructions and alternative hosts can also
be used as is understood in the art.

Reagents and Assays for a Novel 170 kD Lectin Subunit



WO 95/00849    PCT/US94/06890

To determine the existence and complexity of
the 170 kDa subunit gene family, hgl, an amebic genomic
library in lambda phage was hybridized with DNA fragments
from the 5' or 3' ends of hgl1. Termini from three
distinct heavy subunit genes were identified including
hgl1, hgl2, and a third, unreported gene designated hgl3.
The open reading frame of hgl3 was sequenced in its
entirety (Figure 4A). Nonstringent hybridization of a
genomic Southern blot with heavy subunit specific DNA
labeled only those bands predicted by hgl1-3. The amino
acid sequence of hgl3 (Figure 4B) was 95.2% identical to
hgl1 and 89.4% identical to hgl2. All 97 cysteine
residues present in the heavy subunit were conserved in
hgl1-3. Analysis of amebic RNA showed that all three
heavy subunit genes were expressed in the amebae and that
hgl message became less abundant as the amebae entered a
stationary growth phase.

Accordingly, the present invention provides
both nucleic acid and immunological reagents specific for
170 kDa subunits encoded by each of the hgl1, hgl2 or
hgl3 genes, as well as reagents which detect common
regions of all three hgl genes and their nucleic acid or
protein products. For example, oligonucleotide probes
specific for any one of these three genes may be
identified by one of ordinary skill in the art, using
conventional nucleic acid probe design principles, by
comparisons of the three DNA sequences for these genes,
which sequences are disclosed in Figure 1A and Figure 4A
for hgl1 and hgl3, respectively, and for hgl2, in
Tannich, E. et al. Proc Natl Acad Sci USA (1991) 88:1849-
1853, the entire disclosure of which is hereby
incorporated herein by reference. Example 6 illustrates
the use of oligonucleotide probes specific for each of
the three hgl genes, for determining the level of
expression of RNA from each gene using Northern blot
analyses. Other methods of using hgl-specific nucleic
acids for diagnostic purposes, for pathogenic and/or
nonpathogenic forms of E. histolytica, are described in
U.S. Patent 5,260,429, the entire disclosure of which is
incorporated herein by reference.
The following Examples are intended to
illustrate but not to limit the invention.

Example 1 
Construction of Expression Vectors 
The 170 kD subunit of the galactose lectin is
encoded by at least two genes. The DNA used for all of
the constructions described herein encodes the 170 kD
lectin designated hgl1 (Fig. 1A). The nucleotide
position designations refer to the numbering in Figure
1A.

The DNA sequence encoding hgl1 was expressed in
three portions:

fragment C (nucleotides 46-1833) included the
cysteine- and tryptophan-rich region, the cysteine-free
region, and 277 amino acids of the cysteine-rich domain,
i.e. amino acid residues 2-596;

fragment A (nucleotides 1492-3460) encoded the
majority of the cysteine-rich domain, i.e. amino acid
residues 482-1138;

fragment B (nucleotides 3461-3892) included 70 amino
acids of the cysteine-rich domain, the putative membrane-
spanning region, and the cytoplasmic tail, i.e. amino
acid residues 1139-1276.

See Fig. 2B.

Each of these three fragments was inserted in
frame by ligation into pGEX2T or pGEX3X to obtain these
proteins as GST fusions. A diagram of the vectors
constructed is shown in Figure 2A.

Fragment C was produced by PCR amplification.
Primers were designed so that a BamHI site was added to
the 5' end and an EcoRI site was added to the 3' end
during the PCR process. The PCR product, fragment C, was
then digested with restriction enzymes BamHI and EcoRI,
purified, and ligated into similarly digested pGEX3X.
Fragments A and B were produced by digestion with EcoRI
from plasmid clones (Mann, BJ et al. Proc Natl Acad Sci
USA (1991) 88:3248-3252) and ligated into pGEX2T that had
been digested with EcoRI. In the pGEX expression system
a recombinant protein is expressed as a fusion protein
with glutathione S-transferase (GST) from Schistosoma
japonicum and is under the control of the tac promoter.
The tac promoter is inducible by IPTG. The construction
of the vectors and subsequent expression is further
described in Mann, BJ et al. Infect Immun (1993)
61:1772-1778, referenced above, and incorporated herein
by reference.

Expression in the correct reading frame was
verified for all constructs by sequencing and by Western
immunoblot analysis testing for reactivity with anti-
adhesin antisera (data not shown). Expression of the
hgl1 fusion proteins was shown to be inducible by IPTG.
The GST protein produced from the original pGEX2T did not
react with the anti-adhesin sera. The GST portion of
the fusion protein has a molecular mass of 27.5 kD.

Example 2

Production of Recombinant Protein

The four vectors described above, as well as
the host vector, were transfected into competent E. coli
hosts and expression of the genes encoding the fusion
proteins was effected by induction with IPTG. Production
of the fusion proteins was determined by Western blot
SDS-PAGE analysis of the lysates.



Example 3

Reactivity of Recombinant 170 kD Subunit
Fusion Proteins with MAbs

Induced cultures containing bacterial strains
expressing hgl1 fragment A, B, or C were harvested, lysed
in sample buffer, and applied to an SDS-polyacrylamide
gel. After electrophoresis, the proteins were
transferred to Immobilon and incubated with anti-170-kD
MAbs specific for seven different epitopes.
Characteristics of the individual MAbs are shown in Table
1. It will be noted that all the known epitopes are in
the region of amino acids 596-1138.

TABLE 1. Characteristics of monoclonal antibodies directed against
the galactose adhesin 170 kD subunit

Epitope #  Designation  Isotype  Adherence(1)  Cytotoxicity(2)  C5b9 Resistance(3)  P(4)  NP(4)  Location(5)
1          3F4          IgG      Increases     Decreases        No effect           +            895-998
2          8A3          IgG      Increases     No effect        Decreases                        895-998
3          7F4          IgG2b    No effect     No effect        Decreases                        1082-1138
4          8C12                  Inhibits      Inhibits         Decreases                        895-998
5          1G7                   Inhibits      Inhibits         Decreases                        596-818
6          H85          IgG2b    Inhibits      Inhibits         Blocks                           1033-1082
7          3D12         IgG1     No effect     Not tested       Blocks                           895-998

(1) Adherence was assayed by the binding of Chinese hamster ovary (CHO) cells to E. histolytica
trophozoites and by binding of 125I-labeled purified colonic mucins to trophozoites. Petri, W.A.
Jr., et al., J Immunol (1990) 144:4803-4809.

(2) The assay for cytotoxicity was CHO cell killing by E. histolytica trophozoites as measured by
51Cr release from labeled CHO cells. Saffer, L.D., et al. Infect Immun (1991) 59:4681-4683.

(3) C5b9 resistance was assayed by the addition of purified complement components to E. histolytica
trophozoites. The percent of amebic lysis was determined microscopically. Braga, L.L., et al. J
Clin Invest (1992) 90:1131-1137.

(4) P and NP refer to reactivity of the MAb with pathogenic (P) and nonpathogenic (NP) species of
E. histolytica as determined in an ELISA assay. Petri, W.A. Jr., et al. Infect Immun (1991)
58:1802-1806.

(5) Location of antibody binding site by amino acid number. Results presented herein.

(6) Inhibits adherence to CHO cells but not human colonic mucin glycoproteins. Petri, W.A. Jr., et
al., J Immunol (1990) 144:4803-4809.

Fusion proteins B and C failed to react with
any of the seven MAbs (data not shown). Fusion protein
A, representing positions 482-1138, reacted with all
seven MAbs, representing all seven epitopes, but not in a
negative control developed with an irrelevant MAb,
MOPC21. The MAbs were used at 10 µg/ml and polyclonal
antibodies at 1:1000 dilution. These results indicated
that these seven epitopes were contained within the 542
amino acids of the cysteine-rich extracellular domain of
the 170 kD subunit.

The generation of 3' deletions by controlled
ExoIII digestion of fragment A of the 170 kD subunit is
outlined in Figure 2B. A1 contains amino acid residues
482-1082; A2 contains amino acid residues 482-1032; A3
contains amino acid residues 482-998. The reactivities
of the fusion proteins that include fragment A or either
of two carboxy-terminal deletions (A3 and A4) with the
seven distinct 170 kD-specific MAbs were determined.
Deletion 3 reacted with MAbs against epitopes 1-2, 4-5,
and 7 but failed to react with MAbs recognizing epitopes
3 and 6; Deletion 4, which contains residues 482-894,
reacted only with the MAb which recognizes epitope 5.

The five deletion derivatives of fusion protein
A shown in Figure 2B, ranging in estimated size from 35
to 68 kD, were tested for reactivity to each MAb, and the
reactivities of the deletions with each MAb are
summarized in Fig. 3. The endpoints of the various
deletions were determined by DNA sequencing with primers
specific for the remaining hgl1 sequence. MAbs
recognizing epitopes 1 and 2, which increase amebic
adherence to target cells, failed to react with
recombinant lectin fusion proteins lacking amino acids
895 to 998. Similarly, MAbs recognizing epitope 4, an
inhibitory epitope, and epitope 7, which has the effect
of abrogating amebic lysis by complement, failed to react
with deletion mutants lacking this region. The MAb
specific for epitope 6, which has inhibitory effects on
amebic adherence and abrogates amebic lysis by
complement, did not react with a recombinant protein
missing amino acids 1033 to 1082. Recombinant proteins
lacking amino acids 1082 to 1138 did not react with a MAb
which is specific for the neutral epitope 3. Finally, a
construct containing amino acids 482 to 818 was
recognized only by the adherence-inhibitory epitope 5
MAb. The thus predicted locations of the MAb epitopes
are listed in Table 1 above.
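
The deletion-mapping logic used above can be sketched in a few lines: an epitope is placed between the end of the longest nonreactive deletion and the end of the shortest reactive one. This is an illustrative sketch, not the procedure actually used; the names are hypothetical, every construct is assumed to start at residue 482, the label "A10" for the 482-818 construct is an assumption, and the A1 and A2 rows are inferred from the summary in the text rather than stated construct by construct.

```python
# C-terminal endpoint (amino acid number) of fragment A and its 3' deletions.
# All constructs are assumed to begin at residue 482.
CONSTRUCT_END = {"A": 1138, "A1": 1082, "A2": 1032, "A3": 998, "A4": 894, "A10": 818}

# Epitopes whose MAbs reacted with each construct.
# Rows A1 and A2 are inferred from the text, not quoted from it.
REACTIVE = {
    "A": {1, 2, 3, 4, 5, 6, 7},
    "A1": {1, 2, 4, 5, 6, 7},
    "A2": {1, 2, 4, 5, 7},
    "A3": {1, 2, 4, 5, 7},
    "A4": {5},
    "A10": {5},
}


def map_epitope(epitope, start=482):
    """Return (first, last) residue of the interval required for reactivity."""
    reactive_ends = [CONSTRUCT_END[c] for c, eps in REACTIVE.items() if epitope in eps]
    if not reactive_ends:
        raise ValueError(f"epitope {epitope} reacted with no construct")
    upper = min(reactive_ends)  # shortest construct that still reacts
    shorter_nonreactive = [
        CONSTRUCT_END[c]
        for c, eps in REACTIVE.items()
        if epitope not in eps and CONSTRUCT_END[c] < upper
    ]
    lower = max(shorter_nonreactive) + 1 if shorter_nonreactive else start
    return lower, upper


for epitope in range(1, 8):
    print(epitope, map_epitope(epitope))
```

Under these assumptions the sketch places epitopes 1, 2, 4 and 7 at 895-998 and epitope 6 at 1033-1082, in agreement with Table 1. It yields 1083-1138 for epitope 3, where Table 1 lists 1082-1138, and 482-818 for epitope 5, where Table 1 lists 596-818; the lower bound of 596 follows from the failure of fragment C (residues 2-596) to react, which the sketch does not model.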

Example 4

Reactivity of 170 kD Fusion Proteins
with Human Immune Sera

Since the galactose adhesin is a major target
of the humoral immune response in the majority of immune
individuals, the mapping of human B-cell epitopes of the
170 kD subunit was undertaken. The recombinant fusion
proteins and ExoIII-generated deletion constructs of the
170 kD subunit were tested for reactivity with pooled
human immune sera in the same manner as described for MAb
reactivity. Nonimmune sera were used as a control.
Fusion proteins A and C reacted with immune sera, whereas
fusion protein B did not (data not shown). Human immune
sera also reacted with deletion constructs A1, A2, and A3
but not with A4 or A10. Reactivity of immune sera with
the different deletions localized major human B-cell
epitopes to be within the first 482 amino acids and
between amino acids 895 and 1138 (Fig. 3). This second
region is the same area which contains six of the MAb
epitopes. These results are consistent with a report by
Zhang et al. supra, who found that sera from immune
individuals reacted primarily with recombinant adhesin
constructs containing amino acids 1 to 373 and 649 to
1202.




Thus, for use in assays to detect human
antisera against E. histolytica, the useful epitope-
bearing portions are as shown in Table 2.

Table 2

Positions    Epitope #    P/NP
2-482        ?            ?
1082-1138    3            P
1033-1082    6            P
895-998      1,2,4,7      both

The epitope-bearing portions indicated can be
used alone, as fragments or as portions of chimeric or
fusion proteins, or any combination of these epitope-
bearing portions can be used.

Example 5

Immunization Using Recombinant Subunit Protein

A GST fusion protein with fragment A was
prepared in E. coli as described in Example 1 above.
This peptide contains an upstream GST-derived peptide
sequence followed by and fused to amino acids 482-1138
encoded by nucleotides 1492-3460 in Figure 1 herein. The
protein is produced intracellularly; the cells were
harvested and lysed and the lysates subjected to standard
purification techniques to obtain the purified fusion
protein.

Gerbils were immunized by intraperitoneal
injection with 30 µg of purified fusion protein in
complete Freund's adjuvant and then boosted at 2-4 weeks
with 30 µg of the fusion protein in incomplete Freund's
adjuvant.

The gerbils were challenged at 6 weeks by
intrahepatic injection of 5x10^ amebic trophozoites and
sacrificed 8 weeks later. The presence and size of
amoebic liver abscesses were determined.

The results of the two experiments described
above are shown in the table below. The administration
of the fusion protein reduced the size of abscesses in a
statistically significant manner.

In experiment 1, six animals were used as
controls and nine were administered the fusion protein;
in experiment 2, seven animals were used as controls and
seven were provided the fusion protein.

                      Experiment 1             Experiment 2
                  Abscess      % with      Abscess      % with
                  Weight       Abscess     Weight       Abscess
Control           1.44±1.64    71%         4.76±1.78    100%
GST-(482-1138)    0.81±0.10*   100%        2.35±1.99†   100%

* P<0.03 compared to control.
† P<0.24 compared to control.

Example 6

Analysis of the Gene Family Encoding the 170 kD Subunit
of E. histolytica Gal/GalNAc Adherence Lectin

This Example shows that the adhesin 170 kDa
subunit of HM-1:IMSS strain E. histolytica is encoded by
a gene family that includes hgl1, hgl2 and a previously
undescribed third gene herein designated hgl3. Since
hgl1 and hgl2 were originally sequenced, in part, from
different cDNA libraries, it was possible that they
represented strain differences of a single gene.
However, in this report both 5' and 3' termini of hgl1,
hgl2, and hgl3 were isolated and sequenced from the same
lambda genomic library, demonstrating unambiguously that
hgl is a gene family.

Comparison of the amino acid sequences of the
three heavy subunit genes showed that hgl1 and hgl2 are
89.2% identical, hgl1 and hgl3 are 95.2% identical, and
hgl2 and hgl3 are 89.4% identical. Sequence variation
within the gene family, however, appears to be
nonrandomly distributed within the coding sequence. The
majority of the nonconservative amino acid substitutions
as well as insertions and deletions occur in the amino
third of the molecule. Comparison of the amino acid
sequences of hgl2 and hgl3 reveals that 11 of the 19
nonconservative amino acid substitutions and 11 of the 13
residues inserted or deleted reside within the first 400
amino acid residues. A similar pattern of variation is
present when hgl1 and hgl2 are compared. While hgl1 and
hgl3 contain only two nonconservative substitutions, both
are found within the first 400 residues, although the 57
conservative substitutions appear to be more randomly
distributed throughout the coding sequence. The high
degree of sequence conservation between hgl3 and hgl1
suggests that they may have arisen from a recent gene
duplication event.

All 97 cysteine residues were maintained in the
three heavy subunit genes. The hgl2 gene was originally
reported lacking a single cysteine present in both hgl1
and hgl3. However, this discrepancy has since been
recognized as a sequencing error (Dr. E. Tannich,
Bernhard Nocht Institute, Hamburg, Germany). The
cysteine residues are nonrandomly distributed throughout
the gene (Figure 4) with the highest concentration within
the cysteine-rich domain between amino acid residues 379-
1210. All seven identified epitopes recognized by murine
monoclonal antibodies map to this region (Mann, B.J. et
al. Infect Immun (1993) 61:1772-1778). As these
monoclonal antibodies can block target cell adhesion,
target cell lysis (Saffer, L.D. et al. Infect Immun
(1991) 59:4681-4683), and/or resistance to host
complement-mediated lysis (Braga, L.L. et al. J Clin
Invest (1992) 90:1131-1137), the conservation of cysteine
residues may play an important role in maintaining the
conformation of this important region of hgl.

A minimum of three genes are shown to make up
the heavy subunit gene family. While it is not possible
to rule out the existence of additional hgl genes, the
Southern blot and library screen data can be explained by
a gene family of three members. As the genomic library
was screened separately with a 5' and a 3' hgl specific
probe, additional heavy subunit genes would be isolated
even if they contained only partial identity with the
gene family at only one end or even if one terminus of an
additional gene had been lost during library
amplification. The library screen looked at more than
3.2x10^8 bases of genomic DNA in an organism with an
estimated genome size of 10^-^ bases (Gelderman, A.H. et
al. J Parasitol (1971) 57:906-911). Thus, a full genomic
equivalent was screened at low stringency for genes
containing identity at either end.
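
The coverage argument above is simple arithmetic: more than 80,000 plaques carrying 4-5 kb inserts amount to at least 3.2x10^8 bases. The sketch below reproduces that figure and, as an added illustration not found in the text, applies the Clarke-Carbon estimate of the chance that a given sequence is represented. The function names are hypothetical, and the genome size in the usage lines is an order-of-magnitude assumption chosen only for illustration.

```python
def bases_screened(plaques, insert_bp):
    """Total genomic DNA examined, assuming one insert per plaque."""
    return plaques * insert_bp


def genome_equivalents(plaques, insert_bp, genome_bp):
    """How many times over the genome was sampled."""
    return bases_screened(plaques, insert_bp) / genome_bp


def probability_of_recovery(plaques, insert_bp, genome_bp):
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N, with f = insert/genome."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** plaques


# 80,000 plaques at the lower insert size of 4 kb, as stated in the text.
print(bases_screened(80_000, 4_000))  # 320000000, i.e. 3.2x10^8

# Genome size below is an illustrative assumption only.
ASSUMED_GENOME_BP = 32_000_000
print(genome_equivalents(80_000, 4_000, ASSUMED_GENOME_BP))
print(probability_of_recovery(80_000, 4_000, ASSUMED_GENOME_BP))
```

The Clarke-Carbon estimate treats inserts as random, independent samples of the genome, which is a reasonable simplification for a randomly sheared library but ignores cloning bias and losses during amplification.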

The Northern data indicated that all three genes
were expressed in the amebae. As the messages of hgl1-3
are predicted to comigrate at 4.0 kb, differential
hybridization was required to ascertain expression of
individual genes. Due to the high degree of identity
between hgl1-3, relatively short oligonucleotides (17-21
bases) were synthesized specific for regions where the
three genes diverge. Each probe was compared by computer
analysis to the other hgl genes to be certain that they
were sufficiently divergent to prevent cross
hybridization. Hybridization and wash conditions were
highly stringent for such A/T rich probes and were done
at temperatures 5°C or less below the predicted Tm based
upon nearest neighbor analysis. While it is impossible
to rule out cross hybridization with other hgl gene
members, these precautions make such an event less
likely.

The Northern blot also indicates that abundance of
mRNA for all three genes decreased as the amebae
progressed from log to stationary growth. This finding
correlates with data which indicate that late log and
stationary phase amebae have a decreased ability to
adhere to, lyse, and phagocytose target cells (Orozco, E.
et al. (1988) "The role of phagocytosis in the pathogenic
mechanism of Entamoeba histolytica", In: Amebiasis:
Human infection by Entamoeba histolytica. (Ravdin J.I.,
ed), pp. 326-338. John Wiley & Sons, Inc., New York).

Details of the experimental methods and results of
the characterization of the hgl multigene family are
presented below.

Library Screen. A lambda Zap® II library containing
randomly sheared 4-5 kb fragments of genomic DNA from
HM-1:IMSS strain E. histolytica was kindly provided by
Dr. J. Samuelson at Harvard University (Kumar, A. et al.
Proc Natl Acad Sci USA (1992) 89:10188-10192). Over
80,000 plaques from the library were screened on a lawn
of XL-1 Blue E. coli (Stratagene, La Jolla, CA).
Duplicate plaque lifts, using Hybond-N membranes
(Amersham, Arlington Heights, IL), were placed in a
prehybridization solution consisting of 6x SSC (0.89 M
sodium chloride and 90 mM sodium citrate), 5x Denhardt's
solution, 0.5% SDS, 50 mM NaPO4 (pH 6.7), and 100 µg/ml
salmon sperm DNA for a minimum of 4 hours at 55°C. A 5'
and a 3' DNA fragment of hgl1 (nucleotides 106-1946 and
3522-3940 respectively) were labeled with [α-32P]dCTP
(Amersham) using the Random Primed DNA Labeling Kit
according to the manufacturer's instructions (Boehringer
Mannheim, Mannheim, Germany) and hybridized separately to
the membranes overnight at 55°C in prehybridization
solution. Membranes were rinsed once and washed once for
15 minutes at room temperature in 2x SSC, 0.1% SDS, then
washed once for 15 minutes at room temperature, and twice
at 55°C for 20 minutes in 0.1x SSC, 0.1% SDS. Plaques that
hybridized with the 5' or the 3' radiolabeled probe on
both duplicate filters were isolated and purified.

Northern blot and hybridization. Total RNA was
harvested from amebae using the guanidinium
isothiocyanate method (RNagen, Promega, Madison, WI).
Polyadenylated RNA was purified from total RNA using
PolyATract System 1000 (Promega). RNA was
electrophoresed through a formaldehyde gel and
transferred to a nylon Zetabind membrane (Cuno) using 25
mM phosphate buffer (pH 7.5) as described (Sambrook, J.
et al. (1989) Molecular Cloning: A laboratory manual.
Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
New York). The membrane was incubated in
prehybridization solution at 37°C for at
least two hours. Oligonucleotides (18-22 nucleotides
long) were end-labeled using polynucleotide kinase and
[γ-32P]ATP (Sambrook, J. et al. (1989) Molecular Cloning:
A laboratory manual. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York), added to the hybridization
mixture and the membrane, and incubated at 37°C
overnight. The membrane was then washed once at room
temperature for 10 minutes, once at 37°C for 10 minutes,
and twice at 40-44°C for 15 minutes each in 2x SSC, 0.1%
SDS. The radiolabeled probes used were:
5'-TTTGTCACTATTTTCTAC-3', hgl1; 5'-TATCTCCATTTGGTTGA-3',
hgl2; 5'-TTTGTCACTATTTTCTAC-3', hgl3; and
5'-CCCAAGCATATTTGAATG-3', EF-1α (Plaimauer, B. et al. DNA
Cell Biol (1993) 12:89-96).

Characterization of the hgl3 gene. The hgl3 
open reading frame was 3876 bases and would result in a 
predicted translation product of 1292 amino acids (Figure 
4). The predicted translation products of hgl1 and hgl2 
would be 1291 and 1285 amino acids, respectively. A 
putative signal sequence and a transmembrane domain were 
identified in the amino acid sequence of hgl3, similar to 
hgl1 and hgl2. The amino-terminal amino acid sequence of 
the mature hgl3 protein, determined by Edman degradation 
(Mann, B.J. et al. Proc Natl Acad Sci USA (1991) 
88:3248-3252), was assigned residue number 1. Previous analysis 
of hgl1 and hgl2 identified a large, conserved, 
extracellular region which was 11% cysteine, designated 
the cysteine-rich domain (Mann, B.J. et al. Parasit Today 
(1991) 7:173-176) (Fig. 2). Sequence analysis of hgl3 
revealed that all 97 cysteine residues present within 
this region were also conserved in both of the previously 
reported heavy subunit genes. 
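
The open reading frame length and the predicted product size quoted above are related by simple codon arithmetic. A minimal sketch in Python follows; the helper name is hypothetical, and it assumes the quoted ORF length counts sense codons only (no stop codon), which the text does not state explicitly.

```python
def predicted_residues(orf_bases: int) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Assumes orf_bases counts only sense codons (no stop codon); this is an
    assumption made for illustration, not something stated in the text.
    """
    if orf_bases <= 0 or orf_bases % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bases // 3


if __name__ == "__main__":
    # 3876 bases is the hgl3 ORF length given in the text.
    print(predicted_residues(3876))
```

Under that assumption, 3876 bases correspond to the 1292 residues reported for hgl3.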

A schematic comparison (Figure 5) of heavy 
subunit gene sequences revealed a high degree of amino 
acid sequence identity. However, seven sites, ranging 
from 3-24 nucleotides, were found where an insertion or 
deletion had occurred in one subunit relative to another, 
all of which maintained the open reading frame. Both 
hgl1 and hgl3 contained a large number of nonconservative 
amino acid substitutions when compared to hgl2, making 
them 89.2% and 89.4% identical to hgl2, respectively. 
By contrast, the comparison of hgl1 and hgl3 revealed only 
two nonconservative substitutions, 57 conservative amino 
acid substitutions and 3 single-residue 
insertions/deletions, making them 95.2% identical. 
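
The 95.2% figure can be checked from the counts given in the text. The sketch below is illustrative only: the function name is hypothetical, and the aligned length of 1292 positions is an assumption (the length of the hgl3 product), since the text does not state the alignment length used.

```python
def percent_identity(differences: int, aligned_length: int) -> float:
    """Percent identity given the number of differing aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= differences <= aligned_length:
        raise ValueError("differences must lie between 0 and aligned_length")
    return 100.0 * (aligned_length - differences) / aligned_length


if __name__ == "__main__":
    # Counts from the text for hgl1 versus hgl3:
    # 2 nonconservative + 57 conservative substitutions + 3 single-residue indels.
    differences = 2 + 57 + 3
    # Assumed alignment length (not stated in the text).
    aligned_length = 1292
    print(round(percent_identity(differences, aligned_length), 1))
```

With these inputs the result rounds to 95.2, in agreement with the value reported for hgl1 and hgl3.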

All 16 potential sites of glycosylation present 
in hgl1 were conserved in hgl3. A sequence analysis of 
hgl2 indicated that it contained only 9 such sites, 
although all 9 were present in hgl1 and hgl3. 
Glycosylation appears to account for approximately 6% of 
the heavy subunits' apparent molecular mass (Mann, B.J. 
et al. Proc Natl Acad Sci USA (1991) 88:3248-3252). 
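
Potential glycosylation sites of this kind are commonly located by scanning for the N-linked sequon N-X-S/T (X being any residue except proline). The sketch below assumes that definition; the text does not say how the sites were identified, and the function name is hypothetical.

```python
import re
from typing import List

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def potential_n_glycosylation_sites(protein: str) -> List[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Short fragment taken from the FIG. 1B listing, for illustration only.
    fragment = "HNVKVEDINKTNIKQDFCQK"
    print(potential_n_glycosylation_sites(fragment))
```

Positions are reported relative to the start of the string passed in, so a full-length sequence would need to be offset to match the mature-protein numbering.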

All three heavy subunits are expressed. Since 
hgl3 was isolated from a genomic library, it was unknown 
whether this gene was transcribed. Polyadenylated RNA was 
harvested from amebae in both log and stationary phase 
growth. Probes specific for hgl1, hgl2, or hgl3 were 
hybridized to a Northern blot and identified an RNA band 
of the predicted size of 4.0 kb. 

As the messages of hgl1-3 are predicted to 
comigrate at 4.0 kb, differential hybridization was 
required to ascertain expression of individual genes 
using Northern blots. Due to the high degree of identity 
between hgl1-3, relatively short oligonucleotides (17-21 
bases) specific for regions where the three genes 
diverge were synthesized. Each probe was compared by 
computer analysis to the other hgl genes to be certain 
that it was sufficiently divergent to prevent cross 
hybridization. Hybridization and wash conditions were 
highly stringent for such A/T rich probes and were done 
at temperatures 5°C or less below the predicted Tm based 
upon nearest neighbor analysis. While it is impossible 
to rule out cross hybridization with other hgl gene 
members, these precautions make such an event less 
likely. 
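
To give a rough feel for the melting temperatures of such short A/T rich probes, the sketch below applies the Wallace rule (2°C per A or T, 4°C per G or C). This is a deliberately cruder estimate than the nearest neighbor analysis used in the text and is offered only as an illustration; the function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate Tm (in °C) of a short oligonucleotide by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is a rough rule of thumb for short
    oligos, not the nearest neighbor method referred to in the text.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty DNA sequence of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Probe sequences as listed in the Northern blot methods.
    probes = {
        "hgl1": "TTTGTCACTATTTTCTAC",
        "hgl2": "TATCTCCATTTGGTTGA",
        "EF-1alpha": "CCCAAGCATATTTGAATG",
    }
    for name, seq in probes.items():
        print(name, len(seq), wallace_tm(seq))
```

A nearest neighbor calculation additionally depends on salt and probe concentration, so its values would be expected to differ from these estimates.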

The message abundance decreased significantly 
as the amebic trophozoites passed from log phase growth 
(lane A) to stationary phase growth (lane B), while the 
control gene, EF-1α, either remained constant or 
increased slightly. This finding correlates with data 
indicating that late log and stationary phase amebae have 
a decreased ability to adhere to, lyse, and phagocytose 
target cells (Orozco, E. et al. (1988) "The role of 
phagocytosis in the pathogenic mechanism of Entamoeba 
histolytica." In: Amebiasis: Human Infection by 
Entamoeba histolytica (Ravdin, J.I., ed.), pp. 326-338. 
John Wiley & Sons, Inc., New York). 

Estimation of the number of heavy subunit genes. 
The observations herein confirm that the adhesin 170 kDa 
subunit of HM-1:IMSS strain E. histolytica is encoded by 
a gene family that includes hgl1, hgl2 and a previously 
undescribed third gene, which is designated hgl3. Since 
hgl1 and hgl2 were originally sequenced, in part, from 
different cDNA libraries, it was possible that they 
represented strain differences of a single gene. 
However, in the present work both 5' and 3' termini of 
hgl1, hgl2, and hgl3 were isolated and sequenced from the 
same lambda genomic library, demonstrating unambiguously 
that hgl is a gene family. 

Comparison of the amino acid sequences of the three 
heavy subunit genes found that hgl1 and hgl2 are 89.2% 
identical, hgl1 and hgl3 are 95.2% identical, and hgl2 
and hgl3 are 89.4% identical. Sequence variation within 
the gene family, however, appears to be nonrandomly 
distributed within the coding sequence. The majority of 
the nonconservative amino acid substitutions as well as 
insertions and deletions occur in the amino-terminal 
third of the molecule. Comparison of the amino acid 
sequences of hgl2 and hgl3 reveals that 11 of the 19 
nonconservative amino acid substitutions and 11 of the 13 
residues inserted or deleted reside within the first 400 
amino acid residues. A similar pattern of variation is 
present when hgl1 and hgl2 are compared. While hgl1 and 
hgl3 contain only two nonconservative substitutions, both 
are found within the first 400 residues, although the 57 
conservative substitutions appear to be more randomly 
distributed throughout the coding sequence. The high 
degree of sequence conservation between hgl3 and hgl1 
suggests that they may have arisen from a recent gene 
duplication event. 

All 97 cysteine residues were maintained in the 
three heavy subunit genes. The hgl2 gene was originally 
reported as lacking a single cysteine present in both hgl1 
and hgl3. However, this discrepancy has since been 
recognized as a sequencing error (Dr. E. Tannich, 
Bernhard Nocht Institute, Hamburg, Germany, personal 
communication). The cysteine residues are nonrandomly 
distributed throughout the gene (Fig. 1), with the highest 
concentration within the cysteine-rich domain between 
amino acid residues 379-1210. All seven identified 
epitopes recognized by murine monoclonal antibodies map 
to this region (Mann, B.J. et al. Infect Immun (1993) 
61:1772-1778). As these monoclonal antibodies can block 
target cell adhesion, target cell lysis (Saffer, L.D. et 
al. Infect Immun (1991) 59:4681-4683), and/or resistance 
to host complement-mediated lysis (Braga, L.L. et al. J 
Clin Invest (1992) 1131-1137), the conservation of 
cysteine residues may play an important role in 
maintaining the conformation of this important region of 
hgl. 

A minimum of three genes have been shown to make up 
the heavy subunit gene family, as described herein. 
While it is not possible to rule out the existence of 
additional hgl genes, Southern blot analyses and library 
screen data can best be explained by a gene family of 
three members. For Southern blots, two restriction 
enzymes were identified, DdeI and HindIII, that cut 
genomic DNA to completion and resulted in analyzable 
restriction fragments. As the membrane was hybridized 
with a fragment of hgl1 corresponding to nucleotides 1556 
to 3522, two bands of >976 and 1965 nucleotides should 
have been present from hgl3. This central hgl1 
radioprobe would hybridize with three bands of 1158, 810 
and >1080 nucleotides from hgl1 and would hybridize with 
five bands of 819, 312, 55, 755, and >1080 nucleotides 
from hgl2. The Southern blot showed 7 bands for genomic 
DNA digested with DdeI, at 4200, 3700, 2100, 1800, 1300, 
840, and 760 nucleotides. As the 819 and 810 nucleotide 
bands would be expected to comigrate, all the bands 
observed with DdeI digestion are explained by the 
restriction maps of hgl1-3. 
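
Predicted band sizes of this kind follow from a restriction map and the probe coordinates. The sketch below shows the bookkeeping under stated assumptions: the function name is hypothetical, and the cut positions and sequence length are invented for illustration (chosen so that the internal fragments come out at 1158 and 810 nucleotides); they are not taken from the actual DdeI maps of hgl1-3.

```python
from typing import List, Tuple


def probe_fragments(
    cut_sites: List[int], probe: Tuple[int, int], seq_len: int
) -> List[Tuple[int, bool]]:
    """List restriction fragments that overlap a probe window.

    Returns (size, open_ended) pairs. open_ended is True when the fragment
    reaches either end of the known sequence, so the genomic band can only
    be given as a minimum size (the ">" values in the text).
    """
    bounds = [0] + sorted(cut_sites) + [seq_len]
    start, end = probe
    hits = []
    for left, right in zip(bounds, bounds[1:]):
        if right > start and left < end:  # fragment overlaps the probe
            hits.append((right - left, left == 0 or right == seq_len))
    return hits


if __name__ == "__main__":
    # Hypothetical cut positions and sequence length, for illustration only.
    hypothetical_cuts = [1200, 2358, 3168]
    probe_window = (1556, 3522)  # probe coordinates quoted in the text
    for size, open_ended in probe_fragments(hypothetical_cuts, probe_window, 4090):
        print((">" if open_ended else "") + str(size))
```

Fragments flagged as open-ended correspond to bands whose true genomic size depends on the next cut site outside the sequenced region.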

HindIII has no restriction sites in hgl1-3 
within the coding region and would result in each gene 
being represented by a single band greater than 4.0 kb. 
The Southern blot showed three bands at 17500, 5600, and 
4200 nucleotides. Should an additional heavy subunit 
gene exist, its DdeI and HindIII fragments would need to 
comigrate with hgl1-3 bands, be so divergent that they 
failed to hybridize with the hgl1 probe under very low 
stringency, or be too large to be resolved and 
transferred. 

As to the genomic screening data, because the genomic 
library was screened separately with a 5' and a 3' 
hgl-specific probe, additional heavy subunit genes would be 
isolated even if they contained only partial identity 
with the gene family at only one end or even if one 
terminus of an additional gene had been lost during 
library amplification. The library screen looked at more 
than 3.2x10^7 bases of genomic DNA in an organism with an 
estimated genome size of 10^7.5 bases (Gelderman, A.H. et 
al. J Parasitol (1971) 57:906-911). Thus, a full genomic 
equivalent was screened at low stringency for genes 
containing identity at either end. Of 7 clones 
identified with the 5' heavy subunit-specific probe, 4 
contained inserts that matched the reported sequence for 
hgl1, 2 matched the sequence of hgl2, and 1 clone 
represented hgl3. Of eight clones obtained using the 3' 
radiolabeled fragment, 1 matched the sequence for hgl1, 5 
matched the sequence of hgl2, and 2 represented hgl3. No 
termini were found that did not match the sequence of 
hgl1, hgl2 or hgl3. 
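
The statement that a full genomic equivalent was screened can be checked with a one-line ratio. The sketch below takes the screened amount as 3.2 x 10^7 bases and the genome size as 10^7.5 bases; both values are assumed readings of the figures in the text, and the function name is hypothetical.

```python
def genome_equivalents(screened_bases: float, genome_size_bases: float) -> float:
    """Number of genome equivalents represented by the screened DNA."""
    if screened_bases < 0 or genome_size_bases <= 0:
        raise ValueError("inputs must be non-negative and genome size positive")
    return screened_bases / genome_size_bases


if __name__ == "__main__":
    screened = 3.2e7          # bases screened (assumed reading of the text)
    genome_size = 10 ** 7.5   # estimated genome size in bases (assumed reading)
    print(round(genome_equivalents(screened, genome_size), 2))
```

A ratio close to 1 corresponds to roughly one genome equivalent; it does not by itself guarantee that every locus was sampled, since clones in a library are drawn at random.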
CLAIMS 

1. A method to detect Entamoeba histolytica 
antibodies in a biological sample which method comprises 

contacting said sample with an epitope-bearing portion of 
the 170 kD subunit of E. histolytica Gal/GalNAc adherence 
lectin under conditions wherein said portion forms a 
complex with any antibodies immunoreactive with said 
epitope present in said sample, and 

assessing the presence, absence or amount of 
said complex, 

wherein said epitope-bearing portion is 
nonglycosylated and is in a form obtained by recombinant 
production in a procaryotic host cell culture. 

2. The method of claim 1 wherein said 
contacting is conducted by providing said epitope-bearing 
portion coupled to a solid support and treating said 
solid support with the biological sample. 

3. The method of claim 2 wherein said 
contacting and assessing steps are conducted in a Western 
blot procedure. 

4. The method of claim 2 which further 
comprises treating said solid support with a specific 
binding partner for said epitope under conditions wherein 
any antibody to said epitope in the biological sample 
competes with said specific binding partner for said 
epitope-bearing portion. 

5. The method of claim 4 wherein said 
specific binding partner contains a detectable label and 
said assessing is conducted by measuring the effect of 
the presence of biological sample on the amount of label 
retained on the solid support. 

6. The method of claim 5 wherein said 
specific binding partner is an antibody or 
immunologically reactive portion thereof. 

7. The method of claim 1 wherein said 
epitope-bearing portion contains a detectable label. 

8. The method of claim 1 wherein said 
contacting is conducted by providing said biological 
sample coupled to a solid support and treating said 
support with said epitope-bearing portion. 

9. The method of claim 1 wherein said 
epitope-bearing portion is characteristic of the 
pathogenic form of E. histolytica. 

10. The method of claim 1 wherein said 
epitope-bearing portion is characteristic of the 
non-pathogenic form of E. histolytica. 

11. The method of claim 1 wherein said 
epitope-bearing portion is characteristic of both the 
pathogenic and non-pathogenic form of E. histolytica. 

12. The method of claim 1 wherein said 
epitope-bearing portion consists essentially of amino 
acids of said subunit as shown in Figure 1B in positions 
selected from the group consisting of 2-482, 1082-1138, 
1033-1082, and 895-998, or the corresponding amino acids 
in a naturally occurring variant thereof. 

13. A nonglycosylated epitope-bearing portion 
of the 170 kD subunit of E. histolytica Gal/GalNAc 
adherence lectin in a form produced by recombinant 
procaryotic cell culture. 

14. The epitope-bearing portion of claim 13 
which comprises the complete subunit. 

15. The epitope-bearing portion of claim 13 
which consists essentially of amino acids of said subunit 
as shown in Figure 1B in positions selected from the 
group consisting of 2-482, 1082-1138, 1033-1082, and 
895-998, or the corresponding amino acids in a naturally 
occurring variant thereof. 

16. An article of manufacture useful for the 
analysis of a biological sample for antibodies 
immunoreactive with E. histolytica, which article 
consists essentially of a solid support coupled to an 
epitope-bearing portion of the 170 kD subunit of E. 
histolytica Gal/GalNAc adherence lectin, which portion is 
nonglycosylated and which portion is in a form produced 
by recombinant procaryotic cell culture. 

17. A kit for the analysis of a biological 
sample for antibodies immunoreactive with E. histolytica, 
which kit comprises an epitope-bearing portion of the 170 
kD subunit of E. histolytica Gal/GalNAc adherence lectin 
which is nonglycosylated and which is in a form produced 
by recombinant procaryotic cell culture, along with the 
reagents suitable for assessing the formation of a 
complex between any said antibody in said biological 
fluid and said epitope-bearing portion. 

18. A method to immunize a subject against 
Entamoeba histolytica infection which method comprises 
administering to said subject an effective amount of an 
epitope-bearing portion of the 170 kD subunit of E. 
histolytica Gal/GalNAc adherence lectin which 
epitope-bearing portion is nonglycosylated and is in a form 
obtained by recombinant production in a procaryotic host 
cell culture. 

19. The method of claim 18 wherein said 
epitope-bearing portion consists essentially of amino 
acids of said subunit as shown in Figure 1B in positions 
selected from the group consisting of 482-1138, 596-1138, 
895-998, 1033-1082 and 1082-1138 or the corresponding 
amino acids in a naturally occurring variant of said 170 
kD subunit. 

20. The method of claim 19 wherein said 
positions are 1082-1138 or 1033-1082. 

21. The method of claim 19 wherein said 
positions are 482-1138 or 596-1138. 

22. A vaccine for immunizing a subject against 
E. histolytica infection which vaccine comprises a 
nonglycosylated epitope-bearing portion of the 170 kD 
subunit of E. histolytica Gal/GalNAc adherence lectin in 
a form produced by recombinant procaryotic cell culture. 

23. The vaccine of claim 22 wherein the 
epitope-bearing portion consists essentially of amino 
acids of said subunit as shown in Figure 1B in positions 
selected from the group consisting of 895-998, 1033-1082 
and 1082-1138 or the corresponding amino acids in a 
naturally occurring variant of said 170 kD subunit. 

24. The vaccine of claim 22 wherein said 
positions are 1082-1138 or 1033-1082. 

25. The vaccine of claim 22 wherein said 
positions are 482-1138 or 596-1138. 

26. The epitope-bearing portion of claim 13 
which consists essentially of amino acids of said subunit 
as shown in Figure 1B in positions 482-1138, or the 
corresponding amino acids in a naturally occurring 
variant thereof. 

27. The method of claim 19 wherein said 
positions are 482-1138. 

28. The vaccine of claim 22 wherein said 
positions are 482-1138. 

29. The method of claim 1 wherein said 
epitope-bearing portion is selected from the group 
consisting of: a portion which is characteristic of the 
subunit encoded by the hgl1 gene, a portion which is 
characteristic of the subunit encoded by the hgl2 gene, a 
portion which is characteristic of the subunit encoded by 
the hgl3 gene, and a portion which is shared by the 
subunit encoded by the hgl1, hgl2 and hgl3 genes. 

1 ATG AAA TTA TTA TTA TTA AAT ATC TTA TTA TTA TGT TGT CTT 42 
MKLLLLNILLLCCL 

43 GCA GAT AAA CTT GAT GAA TTT TCA GCA GAT AAT GAC TAT TAT 84 
ADKLDEFSADNDYY 

85 GAC GGT GGT ATT ATG TCT CGT GGA AAG AAT GCA GGT TCA TGG 126 
DGGIMSRGKNAGSW 

127 TAT CAT TCT TAC ACT CAC CAA TAT GAT GTT TTC TAT TAT TTA 168 
YHSYTHQYDVFYYL 

169 GCT ATG CAA CCA TGG AGA CAT TTT GTA TGG ACT ACA TGC GAT 210 
AMQPWRHFVWTTCD 

211 AAA AAT GAT AAT ACA GAA TGT TAT AAA TAT ACT ATC AAT GAA 252 
KNDNTECYKYTINE 

253 GAT CAT AAT GTA AAG GTT GAA GAT ATT AAT AAA ACA AAT ATT 294 
DHNVKVEDINKTNI 

295 AAA CAA GAT TTT TGT CAA AAA GAA TAT GCA TAT CCA ATT GAA 336 
KQDFCQKEYAYPIE 

337 AAA TAT GAA GTT GAT TGG GAC AAT GTT CCA GTT GAT GAA CAA 378 
KYEVDWDNVPVDEQ 

379 CGA ATT GAA AGT GTA GAT ATT AAT GGA AAA ACT TGT TTT AAA 420 
RIESVDINGKTCFK 

421 TAT GCA GCT AAA AGA CCA TTG GCT TAT GTT TAT TTA AAT ACA 462 
YAAKRPLAYVYLNT 

463 AAA ATG ACA TAT GCA ACA AAA ACT GAA GCA TAT GAT GTT TGT 504 
KMTYATKTEAYDVC 

505 AGA ATG GAT TTC ATT GGA GGA AGA TCA ATT ACA TTC AGA TCA 546 
RMDFIGGRSITFRS 

547 TTT AAC ACA GAG AAT AAA GCA TTT ATT GAT CAA TAT AAT ACA 588 
FNTENKAFIDQYNT 

589 AAC ACT ACA TCA AAA TGT CTT CTT AAT GTA TAT GAT AAT AAT 630 
NTTSKCLLNVYDNN 

631 GTT AAT ACA CAT CTT GCA ATT ATC TTT GGT ATT ACT GAT TCT 672 
VNTHLAIIFGITDS 

673 ACA GTC ATT AAA TCA CTT CAA GAG AAT TTA TCT CTT TTA AGT 714 
TVIKSLQENLSLLS 

715 CAA CTA AAA ACA GTC AAA GGA GTA ACA CTC TAC TAT CTT AAA 756 
QLKTVKGVTLYYLK 

757 GAT GAT ACT TAT TTT ACA GTT AAT ATT ACT TTA GAT CAA TTA 798 

FIG. 1A-1 
2185 GAT GAT TGT AAT TCA CGT AAA TCA CAA TGT GGA AAC TTT AAT 2226 
DDCNSRKSQCGNFN 

2227 GGT AAA TGT ATT AAA GGC AGT GAC AAT TCT TAT TCT TGT GTA 2268 
GKCIKGSDNSYSCV 

2269 TTT GAA AAA GAT AAA ACT TCT TCT AAA TCA GAT AAT GAT ATT 2310 
FEKDKTSSKSDNDI 

2311 TGT GCT GAA TGT TCT AGT TTA ACA TGT CCA GCT GAT ACT ACA 2352 
CAECSSLTCPADTT 

2353 TAC AGA ACA TAT ACA TAT GAC TCA AAA ACA GGA ACA TGT AAA 2394 
YRTYTYDSKTGTCK 

2395 GCA ACT GTT CAA CCA ACA CCA GCA TGT TCA GTA TGT GAA AGT 2436 
ATVQPTPACSVCES 

2437 GGT AAA TTT GTA GAG AAA TGC AAA GAT CAA AAA TTA GAA CGT 2478 
GKFVEKCKDQKLER 

2479 AAA GTC ACT TTA GAA AAT GGA AAA GAA TAT AAA TAC ACC ATT 2520 
KVTLENGKEYKYTI 

2521 CCA AAA GAT TGT GTC AAT GAA CAA TGC ATT CCA AGA ACA TAC 2562 
PKDCVNEQCIPRTY 

2563 ATA GAT TGT TTA GGT AAT GAT GAT AAC TTT AAA TCT ATT TAT 2604 
IDCLGNDDNFKSIY 

2605 AAC TTC TAT TTA CCA TGT CAA GCA TAT GTT ACA GCT ACC TAT 2646 
NFYLPCQAYVTATY 

2647 CAT TAC AGT TCA TTA TTC AAT TTA ACT AGT TAT AAA CTT CAC 2688 
HYSSLFNLTSYKLH 

2689 TTA CCA CAA AGT GAA GAA TTT ATG AAA GAG GCA GAC AAA GAA 2730 
LPQSEEFMKEADKE 

2731 GCA TAT TGT ACA TAC GAA ATA ACA ACA AGA GAA TGT AAA ACA 2772 
AYCTYEITTRECKT 

2773 TGT TCA TTA ATT GAA ACT AGA GAA AAA GTC CAA GAA GTT GAT 2814 
CSLIETREKVQEVD 

2815 TTG TGT GCA GAA GAA ACT AAG AAT GGA GGA GTT CCA TTC AAA 2856 
LCAEETKNGGVPFK 

2857 TGT AAG AAT AAC AAT TGC ATT ATT GAT CCT AAC TTT GAT TGT 2898 
CKNNNCIIDPNFDC 

2899 CAA CCT ATT GAA TGT AAG ATT CAA GAG ATT GTT ATT ACA GAA 2940 
QPIECKIQEIVITE 

2941 AAA GAT GGA ATA AAA ACA ACA ACA TGT AAA AAT ACT ACA AAA 2982 
KDGIKTTTCKNTTK 
2983 GCA ACA TGT GAC ACT AAC AAT AAG AGA ATA GAA GAT GCA CGT 3024 
ATCDTNNKRIEDAR 

3025 AAA GCA TTC ATT GAA GGA AAA GAA GGA ATT GAG CAA GTA GAA 3066 
KAFIEGKEGIEQVE 

3067 TGT GCA AGT ACT GTT TGT CAA AAT GAT AAT AGT TGT CCA ATT 3108 
CASTVCQNDNSCPI 

3109 ATT ACT GAT GTA GAA AAA TGT AAT CAA AAC ACA GAA GTA GAT 3150 
ITDVEKCNQNTEVD 

3151 TAT GGA TGT AAA GCA ATG ACA GGA GAA TGT GAT GGT ACT ACA 3192 
YGCKAMTGECDGTT 

3193 TAT CTT TGT AAA TTT GTA CAA CTT ACT GAT GAT CCA TCA TTA 3234 
YLCKFVQLTDDPSL 

3235 GAT AGT GAA CAT TTT AGA ACT AAA TCA GGA GTT GAA CTT AAC 3276 
DSEHFRTKSGVELN 

3277 AAT GCA TGT TTG AAA TAT AAA TGT GTT GAG AGT AAA GGA AGT 3318 
NACLKYKCVESKGS 

3319 GAT GGA AAA ATC ACA CAT AAA TGG GAA ATT GAT ACA GAA CGA 3360 
DGKITHKWEIDTER 

3361 TCA AAT GCT AAT CCA AAA CCA AGA AAT CCA TGC GAA ACC GCA 3402 
SNANPKPRNPCETA 

3403 ACA TGT AAT CAA ACA ACT GGA GAA ACT ATT TAC ACA AAG AAA 3444 
TCNQTTGETIYTKK 

3445 ACA TGT ACT GTT TCA GAA TTC CCA ACA ATC ACA CCA AAT CAA 3486 
TCTVSEFPTITPNQ 

3487 GGA AGA TGT TTC TAT TGT CAA TGT TCA TAT CTT GAC GGT TCA 3528 
GRCFYCQCSYLDGS 

3529 TCA GTT CTT ACT ATG TAT GGA GAA ACA GAT AAA GAA TAT TAT 3570 
SVLTMYGETDKEYY 
-15 MKLLL LNILLLCCLA DKLDEFSADN DYYDGGIMSR GKNAGSWYHS 

31 YTHQYDVFYY LAMQPWRHFV WTTCDKNDNT ECYKYTINED HNVKVEDINK 
81 TNIKQDFCQK EYAYPIEKYE VDWDNVPVDE QRIESVDING KTCFKYAAKR 
131 PLAYVYLNTK MTYATKTEAY DVCRMDFIGG RSITFRSFNT ENKAFIDQYN 
181 TNTTSKCLLN VYDNNVNTHL AIIFGITDST VIKSLQENLS LLSQLKTVKG 
231 VTLYYLKDDT YFTVNITLDQ LKYDTLVKYT AGTGQVDPLI NIAKNDLATK 
281 VADKSKDKNA NDKIKRGTMI VLMDTALGSE FNAETEFDRK NISVHTVVLN 
331 RNKDPKITRS ALRLVSLGPH YHEFTGNDEV NATITALFKG IRANLTERCD 
381 RDKCSGFCDA MNRCTCPMCC ENDCFYTSCD VETGSCIPWP KAKPKAKKEC 
431 PATCVGSYEC RDLEGCVVTK YNDTCQPKVK CMVPYCDNDK NLTEVCKQKA 
481 NCEADQKPSS DGYCWSYTCD QTTGFCKKDK RGKEMCTGKT NNCQEYVCDS 
531 EQRCSVRDKV CVKTSPYIEM SCYVAKCNLN TGMCENRLSC DTYSSCGGDS 
581 TGSVCKCDST TGNKCQCNKV KNGNYCNSKN HEICDYTGTT PQCKVSNCTE 
631 DLVRDGCLIK RCNETSKTTY WENVDCSNTK IEFAKDDKSE TMCKQYYSTT 
681 CLNGKCVVQA VGDVSNVGCG YCSMGTDNII TYHDDCNSRK SQCGNFNGKC 
731 IKGSDNSYSC VFEKDKTSSK SDNDICAECS SLTCPADTTY RTYTYDSKTG 
781 TCKATVQPTP ACSVCESGKF VEKCKDQKLE RKVTLENGKE YKYTIPKDCV 
831 NEQCIPRTYI DCLGNDDNFK SIYNFYLPCQ AYVTATYHYS SLFNLTSYKL 
881 HLPQSEEFMK EADKEAYCTY EITTRECKTC SLIETREKVQ EVDLCAEETK 
931 NGGVPFKCKN NNCIIDPNFD CQPIECKIQE IVITEKDGIK TTTCKNTTKA 
981 TCDTNNKRIE DARKAFIEGK EGIEQVECAS TVCQNDNSCP IITDVEKCNQ 
1031 NTEVDYGCKA MTGECDGTTY LCKFVQLTDD PSLDSEHFRT KSGVELNNAC 
1081 LKYKCVESKG SDGKITHKWE IDTERSNANP KPRNPCETAT CNQTTGETIY 
1131 TKKTCTVSEF PTITPNQGRC FYCQCSYLDG SSVLTMYGET DKEYYDLDAC 
1181 GNCRVWNQTD RTQQLNNHTE CILAGEINNV GAIAAATTVA AVIVAVVVAL 
1231 IVVSIGLFKT YQLVSSAMKN AITITNENAE YVGADNEATN AATFNG 

1 TTC TGT TAA ATA GGA AAG GCA AGT GAT TTA AAC AAG ACA ATG 42 

43 AAC TAG AAA GAC AAA GAT ATG AAA TTA TTA TTA TTA AAT ATC 84 
MKLLLLNI 

85 TTA TTA TTA TGT TGT CTT GCA GAT AAA CTT AAT GAA TTT TCA 126 
LLLCCLADKLNEFS 

127 GCA GAT ATT GAT TAT TAT GAC CTT GGT ATT ATG TCT CGT GGA 168 
ADIDYYDLGIMSRG 

169 AAG AAT GCA GGT TCA TGG TAT CAT TCT TAT GAA CAT CAA TAT 210 
KNAGSWYHSYEHQY 

211 GAT GTT TTC TAT TAT TTA GCT ATG CAA CCA TGG AGA CAT TTT 252 
DVFYYLAMQPWRHF 

253 GTA TGG ACT ACT TGT ACA ACA ACT GAT GGC AAT AAA GAA TGT 294 
VWTTCTTTDGNKEC 

295 TAT AAA TAT ACT ATC AAT GAA GAT CAT AAT GTA AAG GTT GAA 336 
YKYTINEDHNVKVE 

337 GAT ATT AAT AAA ACA GAT ATT AAA CAA GAT TTT TGT CAA AAA 378 
DINKTDIKQDFCQK 

379 GAA TAT GCA TAT CCA ATT GAA AAA TAT GAA GTT GAT TGG GAC 420 
EYAYPIEKYEVDWD 

421 AAT GTT CCA GTT GAT GAA CAA CGA ATT GAA AGT GTA GAT ATT 462 
NVPVDEQRIESVDI 

463 AAT GGA AAA ACT TGT TTT AAA TAT GCA GCT AAA AGA CCA TTG 504 
NGKTCFKYAAKRPL 

505 GCT TAT GTT TAT TTA AAT ACA AAA ATG ACA TAT GCA ACA AAA 546 
AYVYLNTKMTYATK 

547 ACT GAA GCA TAT GAT GTT TGT AGA ATG GAT TTC ATT GGA GGA 588 
TEAYDVCRMDFIGG 

589 AGA TCA ATT ACA TTC AGA TCA TTT AAC ACA GAG AAT AAA GCA 630 
RSITFRS 

631 TTT ATT GAT CAA TAT AAT ACA AAC ACT ACA TCA AAA TGT CTT 672 
FIDQYNT 

673 CTT AAA GTA TAT GAT AAT AAT 
LKVYDNN 

715 ATC TTT GGT ATT ACT GAT TCT 
IFGITDS 

1513 gat aag aat cta act gaa gta tgt aaa caa aaa gct aat tgt 1554 
dknltevckqkanc 

1555 gaa gca gat caa aaa cca agt tct gat gga tat tgt tgg agt 1596 
eadqkpssdgycws 

1597 tat aca tgt gac caa act act ggt ttt tgt aag aaa gat aaa 1638 
ytcdqttgfckkdk 

1639 cgt ggt gaa aat atg tgt aca gga aag aca aat aac tgt caa 1680 
rgenmctgktnncq 

1681 GAA TAT GTT TGT GAT GAA AAA CAA AGA TGT ACT GTT CAA GAA 1722 
EYVCDEKQRCTVQE 

1723 AAG GTA TGT GTA AAA ACA TCA CCT TAT ATT GAA ATG TCA TGT 1764 
KVCVKTSPYIEMSC 

1765 TAT GTA GCC AAG TGT AAT CTC AAT ACA GGT ATG TGT GAG AAC 1806 
YVAKCNLNTGMCEN 

1807 AGA TTA TCA TGT GAT ACA TAC TCA TCA TGT GGT GGA GAT TCT 1848 
RLSCDTYSSCGGDS 

1849 ACA GGA TCA GTA TGT AAA TGT GAT TCT ACA ACT AAT AAC CAA 1890 
TGSVCKCDSTTNNQ 

1891 TGT CAA TGT ACT CAA GTA AAA AAC GGT AAT TAT TGT GAT TCT 1932 
CQCTQVKNGNYCDS 

1933 AAT AAA CAT CAA ATT TGT GAT TAT ACA GGA AAA ACA CCA CAA 1974 
NKHQICDYTGKTPQ 

1975 TGT AAA GTG TCT AAT TGT ACA GAA GAT CTT GTT AGA GAT GGA 2016 
CKVSNCTEDLVRDG 

2017 TGT CTT ATT AAG AGA TGT AAT GAA ACA AGT AAA ACA ACA TAT 2058 
CLIKRCNETSKTTY 

2059 TGG GAG AAT GTT GAT TGT TCT AAA ACT GAA GTT AAA TTC GCT 2100 
WENVDCSKTEVKFA 

2101 CAA GAT GGT AAA TCT GAA AAT ATG TGT AAA CAA TAT TAT TCA 2142 
QDGKSENMCKQYYS 

2143 ACr ACA TGT TTG AAT GGA CAA TGT GTT GTT CAA GCA GTT GGT 2184 
TTCLNGQCVVQAVG 

2185 GAT GTT TCT AAT GTA GGA TGT GGA TAT TGT TCA ATG GGA ACA 2226 
DVSNVGCGYCSMGT 

2227 GAT AAT ATT ATT ACA TAT CAT GAT GAT TGT AAT TCA CGT AAA 2268 
DNIITYHDDCNSRK 

3025 TGT AAA AAT ACC ACA AAA ACA ACA TGT GAC ACT AAC AAT AAG 3066 
CKNTTKTTCDTNNK 

3067 AGA ATA GAA GAT GCA CGT AAA GCA TTC ATT GAA GGA AAA GAA 3108 
RIEDARKAFIEGKE 

3109 GGA ATT GAG CAA GTA GAA TGT GCA AGT ACT GTT TGT CAA AAT 3150 
GIEQVECASTVCQN 

3151 GAT AAT AGT TGT CCA ATT ATT ACT GAT GTA GAA AAA TGT AAT 3192 
DNSCPIITDVEKCN 

3193 CAA AAC ACA GAA GTA GAT TAT GGA TGT AAA GCA ATG ACA GGA 3234 
QNTEVDYGCKAMTG 

3235 GAA TGT GAT GGT ACT ACA TAT CTT TGT AAA TTT GTA CAA CTT 3276 
ECDGTTYLCKFVQL 

3277 ACT GAT GAT CCA TCA TTA GAT AGT GAA CAT TTT AGA ACT AAA 3318 
TDDPSLDSEHFRTK 

3319 TCA GGA GTT GAA CTT AAC AAT GCA TGT TTG AAA TAT AAA TGT 3360 
SGVELNNACLKYKC 

3361 GTT GAG AGT AAA GGA AGT GAT GGA AAA ATC ACA CAT AAA TGG 3402 
VESKGSDGKITHKW 

3403 GAA ATT GAT ACA GAA CGA TCA AAT GCT AAT CCA AAA CCA AGA 3444 
EIDTERSNANPKPR 

3445 AAT CCA TGC GAA ACC GCA ACA TGT AAT CAA ACA ACT GGA GAA 3486 
NPCETATCNQTTGE 

3487 ACT ATT TAC ACA AAG AAA ACA TGT ACT GTT TCA GAA GAA TTC 3528 
TIYTKKTCTVSEEF 

3529 CCA ACA ATC ACA CCA AAT CAA GGA AGA TGT TTC TAT TGT CAA 3570 
PTITPNQGRCFYCQ 

3571 TGT TCA TAT CTT GAC GGT TCA TCA GTT CTT ACT ATG TAT GGA 3612 
CSYLDGSSVLTMYG 

3613 GAA ACA GAT AAA GAA TAT TAT GAT CTT GAT GCA TGT GGT AAT 3654 
ETDKEYYDLDACGN 

3655 TGT CGT GTT TGG AAT CAG ACA GAT AGA ACA CAA CAA CTT AAT 3696 
CRVWNQTDRTQQLN 

3697 AAT CAC ACC GAG TGT ATT CTC GCA GGA GAA ATT AAT AAT GTT 3738 
NHTECILAGEINNV 

3739 GGA GCT ATT GCA GCG GCA ACT ACT GTG GCT GTA GTT GTA GTT 3780 
GAIAAATTVAVVVV 

3781 GCA GTC GTA GTT GCA TTA ATT GTT GTT TCT ATT GGA TTA TTT 3822 
AVVVALIVVSIGLF 

3823 AAG ACT TAT CAA CTT GTT TCA TCA GCT ATG AAG AAT GCC ATT 3864 
KTYQLVSSAMKNAI 

3865 ACA ATA ACT AAT GAA AAT GCA GAA TAT GTT GGA GCA GAT AAT 3906 
TITNENAEYVGADN 

3907 GAA GCA ACT AAT GCA GCA ACA TTC AAT GGA TAA GAA CAA TAA 3948 
EATNAATFNGZ 

3949 TTA AGA GAA TTG AAT AAC ATT TTA TGT TTT TAG ATT AAA AAT 3990 
3991 AAA AAG AAG AAT AAA TTG AGT GAT AAA CAA TGA ATA AAA TAA 4032 
4033 ATA AAA ATA AAC AAG AAT AAA GTG AAC ATC ATT TTT ATT TTC 4074 
4075 ATA TTT TAA CAA CAC T 4090 